Series Preface


Soun is noght but air y-broke

— Geoffrey Chaucer 
end of the 14th century 


Traditionally, acoustics has formed one of the fundamental branches of physics. 
In the twentieth century, the field has broadened considerably and has become 
increasingly interdisciplinary. At the present time, specialists in modern acoustics
can be encountered not only in physics departments, but also in electrical and 
mechanical engineering departments, as well as in mathematics, oceanography, 
and even psychology departments. They work in areas spanning from musical 
instruments to architecture to problems related to speech perception. Today, six 
hundred years after Chaucer made his brilliant remark, we recognize that sound 
and acoustics is a discipline extremely broad in scope, literally covering waves 
and vibrations in all media at all frequencies and at all intensities. 

This series of scientific literature, entitled Modern Acoustics and Signal Processing
(MASP), covers all areas of today’s acoustics as an interdisciplinary field.
It offers scientific monographs, graduate-level textbooks, and reference materials 
in such areas as architectural acoustics, structural sound and vibration, musical 
acoustics, noise, bioacoustics, physiological and psychological acoustics, speech, 
ocean acoustics, underwater sound, and acoustical signal processing. 

Acoustics is primarily a matter of communication. Whether it be speech or 
music, listening spaces or hearing, signaling in sonar or in ultrasonography, we seek 
to maximize our ability to convey information and, at the same time, to minimize 
the effects of noise. Signaling has itself given birth to the field of signal processing, 
the analysis of all received acoustic information or, indeed, all information in any 
electronic form. With the extreme importance of acoustics for both modern science
and industry in mind, AIP Press, now an imprint of Springer-Verlag, initiated this
series as a new and promising publishing venture. We hope that this venture will 
be beneficial to the entire international acoustical community, as represented by 
the Acoustical Society of America, a founding member of the American Institute 
of Physics, and other related societies and professional interest groups. 





It is our hope that scientists and graduate students will find the books in this 
series useful in their research, teaching, and studies. As James Russell Lowell once 
wrote, “In creating, the only hard thing’s to begin.” This is such a beginning. 

Robert T. Beyer 
Series Editor-in-Chief 



Preface 


Concert hall acoustics can be thought of as the place where science and art meet. 
Imagine how many scientists have contributed to its development. The oldest 
known writing on the subject, which concerns theater acoustics in ancient Greece 
and Rome, dates from about 25 B.C. It describes advanced designs for better acoustics,
which involved digging holes between chairs and placing bronze vessels
upside down in the holes according to mathematics-based music theory. New York
Philharmonic Hall, which opened in 1962, was designed on the basis of temporal 
factors representative of reverberation time. The hall was not well received by 
the public, however, and was closed in 1978 for extensive renovation. The most 
important objective is to blend acoustics and music in such a way that each indi- 
vidual’s feelings harmonize with the hall’s acoustics. Concert hall acoustics has 
been my major area of study for approximately 30 years; the first 20 years were 
devoted to the physical and psychological approaches. The goal was to calculate 
the overall subjective preferences of audience members in each seat. The findings 
are summarized as follows (Ando, Y., Concert Hall Acoustics, Springer-Verlag,
Heidelberg, 1985): 

(1) A hall should be designed for only a certain type of music, because the sound
quality depends on the effective duration of the autocorrelation function of 
music signals and the temporal acoustic factors of sound fields. Musicians 
seem to compose their music with acoustics in mind: pipe organ music was 
written for large spaces, such as Notre Dame Cathedral, and Mozart’s string 
quartets were intended to be heard in court salons. 

(2) The newly introduced spatial factor (IACC) has the greatest effect on subjective
preference and subjective diffuseness among the acoustic factors of sound fields.

(3) The theory allows calculating the global preference of each seat at the design
stage with four physical orthogonal factors of sound fields: the relative sound
pressure level (LL), the initial time delay ($\Delta t_1$) between the direct sound and
the first reflection, the subsequent reverberation time $T_{\mathrm{sub}}$, and the magnitude
of the interaural cross-correlation IACC.

For the past 10 years, my focus has been on auditory-brain function and the
individual. This research shows that the left cerebral hemisphere is associated with
the temporal factors, $\Delta t_1$ and $T_{\mathrm{sub}}$, and that the right cerebral hemisphere is activated
by the more spatial factors, IACC and LL. The information corresponding to
subjective preference of sound fields is found in brain waves. Surprisingly, individual
differences in subjective preference appear in brain waves and are recognized
mainly in temporal factors and LL, not in IACC. Individual differences in LL may
be related to the hearing level. For different values of $\Delta t_1$ and $T_{\mathrm{sub}}$, significant
individual preferences arise. This is most likely the result of differences in individual
temporal activities of the brain. This evidence ensures that the basic theory of 
subjective preference may be applied to each individual preference as well. The 
other fundamental subjective attributes for sound fields can also be described by 
the theory, based on the auditory-brain model with correlation mechanisms and 
the cerebral-hemispheres specialization. 

To blend sound sources and sound fields in a concert hall, the sound fields must 
first be designed to maximize the average preference at each seat. Musicians must 
select music programs appropriate for the hall and performing positions on the stage 
that both maximize the ease of the performers and the preference of the listeners. 
A seat-selection system for satisfying individual preference was introduced at the 
Kirishima International Concert Hall in 1994, and the first international symposium
of experts in the art and science of sound, Music and Concert Hall Acoustics 
(MCHA), was held in May 1995. As described in this book, 106 participants took 
part in a test procedure for seat selection. 

I hope that the theory of incorporating temporal and spatial values for both levels 
of global subjective preferences and individual preference of sound fields can be 
generalized to blend nature, the built environment, and people. 


Yoichi Ando 
Introduction


A number of investigations have been ongoing since my first book, Concert Hall
Acoustics, was published in 1985. Typical examples are the study of hemispheric
brain activities, in order to identify a model of auditory-brain systems, orthogonal
physical factors, and the theory of individual subjective preference. This
book describes comprehensive concepts, theoretical backgrounds and subjective 
evaluations, in addition to subjective preference for the sound fields, as well as ap- 
plications in the design of concert and multiple-purpose halls. Particular emphasis 
has been placed on enhancing performance in the selection of the “most-preferred” 
seat for individuals in a hall. 

It is also of interest that such a theory may be applied in more general physical 
environments, such as those of light and heat, taking the spatial and temporal 
factors into account. 

This book is written for both undergraduate and graduate students in various 
fields including acoustics, psychology and physiology, and musical art, as well 
as professionals in architecture, engineering, and “sound coordinators” of concert 
halls and theaters. Readers who are interested in applications to the design of
concert halls and theaters, as well as electroacoustic systems, are advised to
read Chapters 10 and 11 first, and then proceed to the guidelines described
in Chapters 4, 6, 7, and 8.

Special attention is given to the process of obtaining scientific results, and a model
of the auditory-brain system, rather than describing only a final design method.
Such processes may therefore help researchers who are interested in fusing science 
and art to become aware of the appropriate lines for future work. 


Y. Ando, Architectural Acoustics 
© Springer-Verlag New York, Inc. 1998





2 


Short Historical Review for Acoustics 
in a Performing Space 


Investigations in architectural acoustics to find dimensional and orthogonal factors 
influencing the subjective evaluation of sound fields go as far back as Vitruvius 
(ca. 25 B.C.). In the “ancient architectural acoustics,” the concepts of reverberation, 
interference, echo disturbance, and clarity of voice were described. Vitruvius
made the remarkable statement that “bronze vessels, which were tuned notes of the fourth,
the fifth, and so on,” by calculation according to musical theory, were inverted in
niches and supported on both sides facing the stage by wedges not less than half a
foot high. The niches were constructed in between the seats of the theater. Evidently,
a great deal of scientific effort and attention was devoted at
that time to the space below ear level. In this book, the acoustic design of the floor
structure and seating will also be discussed.

Because electroacoustic techniques were lacking, far fewer investigations of
architectural acoustics were reported between the first and nineteenth
centuries than in ancient times.

In 1857, Henry first mentioned the concept of the impulse, which is utilized in
modern science. In his case, a single impulse from one tooth of a wheel is a
noise, from a series of teeth in succession a continued sound; and if all the teeth 
are equally spaced, and the speed of the wheel is uniform, then a musical note is 
the result. Further, he suggested factors that might be related to good acoustics 
indicating the following conditions: 

(1) the size of the room;

(2) the strength of the sound or intensity of the impulse; 

(3) the position of the reflecting surfaces; and 

(4) the nature of the material of the reflecting surfaces. 

It is interesting to note that these conditions are, to some extent, related to the four 
orthogonal factors that are described in this book. 

Sabine (1900) initiated the science of architectural acoustics, discovering reverberant
sound and the formula to quantify reverberation time. The first careful
experiment on the absolute rate of decay was in the lecture-room of the Boston
Public Library, a large room. On the platform four organ pipes were placed,
all of the same pitch, each with its own wind supply, and each having its own
electropneumatic valve. Thus, one, two, three, or four pipes might start and stop
at once. Then, the minimum audible times were measured. The corresponding durations
of audibility, named $t_1$, $t_2$, $t_3$, and $t_4$, are 8.68 s, 9.14 s, 9.36 s, and 9.55 s,
respectively. The time differences were obtained as follows:

$$
\begin{aligned}
t_2 - t_1 &= 0.45\ \text{[s]},\\
t_3 - t_1 &= 0.67\ \text{[s]},\\
t_4 - t_1 &= 0.86\ \text{[s]}.
\end{aligned} \tag{2.1}
$$

The exponential decay rate of the intensity can be obtained by the use of these 
time differences. Consequently, he finally derived the well-known reverberation 
formula 


$$T_{60} = KV/A, \tag{2.2}$$

where $K$ is a constant (= 0.159) for the velocity of sound at 342 m/s, $V$ is the
volume of the room in cubic meters, and $A$ is the absorbing power of the room.
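
The step from the time differences of Equation (2.1) to a decay rate can be sketched numerically. The following is a minimal illustration, not Sabine's own calculation: it assumes that the intensity decays as $I(t) = I_0 e^{-at}$ and that $n$ pipes start from $n$ times the intensity of a single pipe, so that the extra audible time is $\ln(n)/a$. The function names are hypothetical, and the value of $K$ is simply the one quoted in the text.

```python
import math

# Time differences from Equation (2.1): extra audible time [s] when
# n pipes sound instead of one.
time_differences = {2: 0.45, 3: 0.67, 4: 0.86}


def decay_constant(n_pipes, delta_t):
    """Decay constant a [1/s], assuming I(t) = I0 * exp(-a * t) and that
    n pipes start from n times the intensity of a single pipe."""
    return math.log(n_pipes) / delta_t


def reverberation_time(a):
    """Time [s] for the intensity to fall by 60 dB (a factor of 10**6)."""
    return math.log(1e6) / a


def sabine_t60(volume_m3, absorption, k=0.159):
    """Equation (2.2), T60 = K V / A, with K as quoted in the text."""
    return k * volume_m3 / absorption


rates = [decay_constant(n, dt) for n, dt in time_differences.items()]
mean_rate = sum(rates) / len(rates)
t60 = reverberation_time(mean_rate)

print([round(a, 2) for a in rates])
print(round(mean_rate, 2), round(t60, 1))
```

Under these assumptions the three differences give similar decay constants, which is what an exponential decay predicts; their mean corresponds to a 60 dB decay time of several seconds for this room.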

Sabine recognized the Chapel of the Union Theological Seminary, New York 
City, as a very satisfactory example, without any explanation (Sabine, 1912). Con- 
sidering the fact that the shape of the ceiling in the chapel is like the bottom of a 
boat, which effectively decreases the IACC as described in Section 8.2, it seems 
that Sabine unconsciously noticed the importance of the spatial shape of a room. 

Knudsen (1929) suggested that the optimal reverberation time for speech is 
shorter than that for music. At the same time, MacNair (1930) recommended a 
longer reverberation time in the low-frequency range to supplement the loudness 
of music. 

Bekesy (1934) reported that a courtyard sound field, as shown in Figure 2.1, 
was much better than any of the sound fields in existing halls he had experienced. 
This clearly suggested the importance of side-wall reflections (Figure 7.9). 

In 1949, Haas investigated echo disturbance effects by adjusting the
delay time of the early reflection, moving the head positions of a magnetic tape
recorder. He showed the disturbance of speech echo to be a function of the delay
time, with the amplitude as a parameter. Bolt and Doak (1950) later proposed
the percent disturbance of echoes.

Considering the fact that living creatures emerged and evolved in physical envi- 
ronments, including acoustic, visual, and thermal environments, our sensing organs 
and brain are greatly influenced by the physical environmental factors that existed 
before emergence. Table 2.1 lists physical factors, found by several
authors since 1960, that significantly influence subjective attributes. After investigation
of a number of existing concert halls throughout the world, Beranek (1962)
proposed a rating scale with eight factors of the sound field, from data obtained by 
questionnaire on existing halls, given to experienced listeners. Much attention has 
been given to the temporal factors of the sound field since Sabine’s discovery of 
reverberation theory. Clearly, the binaural effect was not satisfactory to listeners. 

Veneklasen and Christoff (1964) suggested the importance of reflections from
the side walls. West (1966) found the correlation coefficient for $2H/W$ ($H$ is

Figure 2.1. A courtyard with superior acoustics (Bekesy, 1934; Bekesy, 1967).


Table 2.1. Significant physical factors found for sound fields, both existing and simulated, from
systematic subjective judgment tests.

| Author | Year | Subjective judgment | Sound system | Number of factors* | Objective and subjective significant factors found and/or proposed |
|---|---|---|---|---|---|
| Beranek† | 1962 | Questionnaire | Listening in existing halls | 8[6] | (1) Initial time delay; (2) Loudness; (3) Reverberation time, RT; (4) Frequency characteristic of RT; and others. |
| Keet | 1968 | Apparent source width | Simulated | 2 | (1) SPL; (2) ICC.‡ |
| Barron | 1971 | Spatial impression | Direct sound and early reflections simulated | 2 | (1) SCC; (2) SPL; (3) Spectrum. |
| Damaske and Ando | 1972 | Subjective diffuseness and direction of sound source | Dummy head and loudspeaker | 2 | (1) IACC; (2) $\tau_{\mathrm{IACC}}$. |
| Yamaguchi | 1972 | Dissimilarity | Two microphones and headphones | 3[2] | (1) SPL; (2) Frequency characteristic of RT. |
| Edward | 1974 | Dissimilarity | Dummy head and headphones | 3 | (1) RT; (2) Volume level; (3) Early echo pattern. |
| Schroeder, Gottlob, and Siebrasse | 1974 | Preference by paired comparison | Dummy head and loudspeakers | 4[2] | (1) RT; (2) IACC. |
| Ando | 1977 | Preference by paired comparison | Loudspeakers simulation for direct sound and the first reflection | 2 | (1) Initial time delay; (2) IACC. |
| Ando | 1983 | Preference by paired comparison | Loudspeaker simulation | 4 | (1) Listening level; (2) Initial time delay; (3) Subsequent reverberation time; (4) IACC. |
| Cocchi, Farina, and Rocco | 1990 | Preference | Real sound field in a hall | 4 | (1) Listening level; (2) Initial time delay; (3) Subsequent reverberation time; (4) IACC. |
| Sato, Mori, and Ando | 1997 | Preference by paired comparison | Real sound field switching source positions (loudspeakers) at fixed seats in a hall | 4 | (1) Listening level; (2) Initial time delay; (3) IACC; (4) $\tau_{\mathrm{IACC}}$. |

* Numbers in brackets indicate the numbers of dimensions that may be regarded as significant factors.
† Beranek (1996) later proposed six factors, but two added factors are questionable in their orthogonality. With
regard to the frequency characteristics of the reverberation time, the range below 500 Hz is not critical in
preference judgments, so that the preferred range is broad (Ando, Okano, and Takezoe, 1989).
‡ Short-term cross-correlation coefficient.
the height and $W$ is the width of a hall) and a numerical scale of subjective
categories to be 0.71. Damaske (1967/68) investigated subjective diffuseness by 
arranging a number of loudspeakers around the listener. Keet (1968) reported 
the variation of apparent source width (ASW) in relation to the interaural cross- 
correlation coefficient and the sound pressure level. Marshall (1968a, b) stressed the
importance of early lateral reflections of just 90°, and Barron (1971) investigated
“spatial impressions” or “envelopment” of sound fields in relation to the interaural
cross-correlation coefficient. Damaske and Ando (1972) defined the IACC
as the maximum absolute value of the interaural cross-correlation function within 
the possible maximum interaural delay range for humans such that 

$$\mathrm{IACC} = |\phi_{lr}(\tau)|_{\max} \quad \text{for } |\tau| < 1\ \text{ms}, \tag{2.3}$$

and proposed the method of calculating the interaural cross-correlation function 
for the sound fields. 
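
As a rough illustration of Equation (2.3), the sketch below computes a discrete-time IACC for two sampled ear signals. It is a minimal example under stated assumptions, not the measurement method of Damaske and Ando: the function name `iacc`, the sign convention (positive delay when the right-ear signal lags), the normalization by the geometric mean of the two signal energies, and the synthetic test signal are all choices made here for illustration.

```python
import math
import random


def iacc(left, right, fs, max_delay_ms=1.0):
    """Return (IACC, tau_IACC in ms) for two equal-length ear signals.

    The normalized interaural cross-correlation is evaluated for delays
    within +/- max_delay_ms, and the maximum absolute value is taken.
    """
    n = len(left)
    max_lag = int(round(max_delay_ms * 1e-3 * fs))
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        raise ValueError("silent signal")
    best_value, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(n, n - lag)
        acc = sum(left[i] * right[i + lag] for i in range(start, stop))
        if abs(acc) / norm > best_value:
            best_value, best_lag = abs(acc) / norm, lag
    return best_value, 1000.0 * best_lag / fs


# Synthetic check: the right-ear signal is the left-ear noise delayed by
# 12 samples (0.25 ms at 48 kHz), as for a source off to the left.
rng = random.Random(0)
fs = 48000
source = [rng.gauss(0.0, 1.0) for _ in range(960)]
delay = 12
left = source
right = [0.0] * delay + source[:-delay]

value, tau_ms = iacc(left, right, fs)
print(round(value, 3), tau_ms)
```

For this nearly coherent pair the IACC is close to unity and the delay of the maximum recovers the imposed interaural delay; a diffuse sound field would give a much smaller value.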

By dissimilarity tests, Yamaguchi (1972) reported that the sound-pressure level 
and the frequency characteristics are significant factors for a sound field recorded 
in a hall. Edward (1974) also tested the dissimilarity of recorded sound fields, and
reported as important factors the early-echo pattern as well as RT and volume level.
Schroeder, Gottlob, and Siebrasse (1974) reported results of paired-comparison
tests asking which of two music sound fields was preferred. Sound fields
were reproduced at each ear of a listener in an anechoic chamber, through dummy 
head recording and two loudspeaker systems with filters reproducing spatial in- 
formation. They found two significant factors, RT and IACC, having a strong 
influence on subjective preference. Wilkens (1977) claimed that significant sub- 
jective attributes were perception of strength and extension of sound source, as 
well as perception of clarity and tone color. 

Ando and Kageyama investigated subjective preference in relation to physical 
factors, which were calculated from the mathematical expression for sound arriving 
at both ears (Ando, 1977; Ando and Kageyama, 1977). In 1983, Ando published a 
theory of subjective preference in relation to the four orthogonal physical factors 
for a sound field, enabling the calculation of a scale value at each seat (see also 
Ando, 1985, 1986). This theory was first confirmed by Cocchi, Farina and Rocco 
(1990) in an existing hall. Sato, Mori, and Ando (1997) reconfirmed it more clearly 
by the paired-comparison judgments in an existing hall, switching the loudspeakers 
on the stage instead of changing seats. They introduced the interaural delay of the
IACC, $\tau_{\mathrm{IACC}}$, for the image shift of the sound source that is to be avoided or for
the balance of the sound field.

Thus far, this theory has been based on the global subjective attributes for a 
number of subjects. In order to further enhance each individual’s satisfaction, the 
theory may be applied by adjusting the weighting coefficient of each orthogonal 
factor (Ando and Singh, 1996; Singh and Ando, unpublished), even though a certain
amount of inter-individual differences exist (Sakai, Singh, and Ando, 1997). The
seat selection system (Sakurai, Korenaga, and Ando, 1997), which was introduced
after construction of the Kirishima International Concert Hall, is a typical example 
of this application (Ando and Setoguchi, 1995). 
Physical Properties of Source Signals
and Sound Fields in a Room 


Sound signals proceed along auditory pathways and are perceived in a time se- 
quence, and the meanings of the signals are simultaneously interpreted by the brain. 
Thus, a great deal of attention is paid here to analyzing signals in the time domain. 
This chapter treats mainly the autocorrelation function (ACF) of the signal, which
contains the envelope and its fine structure as well as the power at the starting time.
The ACF has the same information as the power density spectrum of the signal
under analysis, but the ACF differs greatly from the spectrum insofar as the signal
processing in the auditory-brain system and the related subjective attributes for
the sound field are concerned.

3.1. Analyses of Source Signals 

3.1.1. Power Density Spectrum 

Let us first discuss signal analysis in the frequency domain in terms of the power 
density spectrum of a signal p(t), which is defined by 

$$P_d(\omega) = P(\omega)P^*(\omega), \tag{3.1}$$

where $P(\omega)$ is the Fourier transform of $p(t)$, given by

$$P(\omega) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} p(t)\,e^{-j\omega t}\,dt, \tag{3.2}$$

and the asterisk denotes the conjugate.

The inverse Fourier transform is the original signal $p(t)$:

$$p(t) = \int_{-\infty}^{+\infty} P(\omega)\,e^{j\omega t}\,d\omega. \tag{3.3}$$

Considering the sharpening effects that exist in the auditory system, the filters
used in hearing tests must have slope characteristics of more
than 1000 dB/octave. This will be examined under the results on loudness in
Section 6.3.
3.1.2. Long-Time Autocorrelation Function (ACF) of a
Sound Source 


One of the most promising signal processes in the auditory system is the ACF, 
which is defined by 

$$\Phi_p(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T} p'(t)\,p'(t+\tau)\,dt, \tag{3.4}$$

where $p'(t) = p(t) * s(t)$, $s(t)$ being the ear sensitivity. For practical convenience,
$s(t)$ may be chosen as the impulse response of an A-weighted network. Also, the
ACF can be obtained from the power density spectrum, which is defined by Equation
(3.1), so that


$$\Phi_p(\tau) = \int_{-\infty}^{+\infty} P_d(\omega)\,e^{j\omega\tau}\,d\omega, \tag{3.5}$$

$$P_d(\omega) = \int_{-\infty}^{+\infty} \Phi_p(\tau)\,e^{-j\omega\tau}\,d\tau. \tag{3.6}$$


Thus, the ACF and the power density spectrum mathematically contain the same 
information. In the ACF analysis, there are three significant parameters, namely, 

(1) the energy represented at the origin of the delay, $\Phi_p(0)$;

(2) the effective duration of the envelope of the normalized ACF, $\tau_e$, which is
defined by the ten-percentile delay, representing a kind of repetitive feature or
reverberation contained within the source signal itself; and

(3) the fine structure, including peaks with their delays and the number of zero crossings.
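
The statement that the ACF and the power density spectrum carry the same information can be checked numerically in a discrete, periodic setting. The sketch below is only an analogue of the continuous-time transform pair: it uses a plain discrete Fourier transform on a short sequence treated as one period, so the ACF is circular, and the scaling constants are those of the DFT rather than of the integrals in the text. All function names are hypothetical.

```python
import cmath
import random


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def acf_direct(x):
    """Circular ACF: mean of x[t] * x[t + lag] over one period."""
    n = len(x)
    return [sum(x[t] * x[(t + lag) % n] for t in range(n)) / n
            for lag in range(n)]


def acf_from_power_spectrum(x):
    """Circular ACF obtained as the inverse DFT of the power spectrum."""
    n = len(x)
    power = [abs(c) ** 2 for c in dft(x)]  # P(w) P*(w), cf. Equation (3.1)
    return [sum(power[k] * cmath.exp(2j * cmath.pi * k * lag / n)
                for k in range(n)).real / (n * n)
            for lag in range(n)]


rng = random.Random(1)
x = [rng.uniform(-1.0, 1.0) for _ in range(32)]

direct = acf_direct(x)
via_spectrum = acf_from_power_spectrum(x)
max_error = max(abs(a - b) for a, b in zip(direct, via_spectrum))
print(max_error)
```

The two routes agree to rounding error, which is the discrete counterpart of the transform pair relating the ACF and the power density spectrum.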


The normalized ACF is defined by 


$$\phi_p(\tau) = \frac{\Phi_p(\tau)}{\Phi_p(0)}. \tag{3.7}$$


Examples of analyzing the normalized ACF ($2T = 35$ s) for the two extreme
music motifs listed in Table 3.1 are shown in Figure 3.1.

When $p'(t)$ is measured with reference to the pressure 20 μPa leading to the
level $L(t)$, the equivalent sound pressure level $L_{eq}$, defined by

$$L_{eq} = 10\log\frac{1}{T}\int_0^T 10^{L(t)/10}\,dt, \tag{3.8}$$

corresponds to

$$10\log\Phi_p(0). \tag{3.9}$$

This is an important factor related to loudness, but it is not the whole story. The 
envelope of the normalized ACF is also related to important subjective attributes, 
as will be detailed in the subsequent chapters. 
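
The correspondence between the equivalent level and the ACF at the origin can be illustrated with sampled data. This is a minimal sketch that assumes $10^{L(t)/10} = (p'(t)/p_0)^2$ with $p_0 = 20$ μPa and replaces the integral by a sample mean; the test tone and the function names are chosen here for illustration only.

```python
import math

P_REF = 20e-6  # reference pressure [Pa]


def leq_from_levels(pressure):
    """Discrete form of Equation (3.8): 10 log10 of the time average of
    10**(L(t)/10), using 10**(L(t)/10) = (p'(t) / P_REF)**2."""
    mean_ratio = sum((p / P_REF) ** 2 for p in pressure) / len(pressure)
    return 10.0 * math.log10(mean_ratio)


def leq_from_acf_origin(pressure):
    """Equation (3.9): 10 log10 Phi_p(0), where Phi_p(0) is the mean square
    of the pressure, expressed relative to P_REF**2."""
    phi0 = sum(p * p for p in pressure) / len(pressure)
    return 10.0 * math.log10(phi0 / P_REF ** 2)


# Test tone: 100 Hz sine with an rms pressure of 1 Pa, ten full periods.
fs, f = 8000, 100.0
tone = [math.sqrt(2.0) * math.sin(2.0 * math.pi * f * i / fs)
        for i in range(800)]

leq_a = leq_from_levels(tone)
leq_b = leq_from_acf_origin(tone)
print(round(leq_a, 2), round(leq_b, 2))
```

Both routes give the same level, about 94 dB for an rms pressure of 1 Pa, because the time average in Equation (3.8) is the zero-delay value of the ACF.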

A good example of applying the ACF is in the discussion of the missing funda- 
mentals of music signals. When the signal contains only a number of harmonics 
without the fundamental frequency, we hear the fundamental frequency as a pitch 
Table 3.1. Music and speech source signals and their effective duration of the ACF, $\tau_e$.

| Sound source* | Title | Composer or writer | $\tau_e$† [ms] | $(\tau_e)_{\min}$‡ [ms] |
|---|---|---|---|---|
| Music Motif A | Royal Pavane | Orlando Gibbons | 127 (127) | 125 |
| Music Motif B | Sinfonietta, Opus 48; IV movement | Malcolm Arnold | 43 (35) | 40 |
| Music Motif B (L + R) | Sinfonietta, Opus 48; IV movement | Malcolm Arnold | | 45 |
| Music Motif C | Symphony No. 102 in B flat major; II movement | Franz J. Haydn | (65) | |
| Music Motif C (L + R) | Symphony No. 102 in B flat major; II movement | Franz J. Haydn | | 70 |
| Music Motif D | Siegfried Idyll; Bar 322 | Richard Wagner | (40) | |
| Music Motif E | Symphony in C major, K.V. no. 551, IV movement | Wolfgang A. Mozart | 38 | |
| Music Motif F | § | Tsuneko Okamoto | 105 | |
| Music Motif G | § | Tsuneko Okamoto | 145 | |
| Music Motif K | Karesansui | Hozan Yamamoto | 220 | 35 |
| Speech S | Poem read by a female | Doppo Kunikida | 10 (12) | |

* The left channel signals of the original recorded signals (Burd, 1969) were used, and (L + R)
indicates the accompanying right channel signal was mixed with the left channel signal containing
the main melody.

† Values of $\tau_e$ differ slightly with different radiation characteristics of the loudspeakers used; thus
all of the physical factors must be measured at the conditions of the hearing tests, $2T = 35$ s.

‡ Recommended method: the value of $(\tau_e)_{\min}$ is obtained by the minimum value of short-moving
ACFs, $2T = 2$ s, with the moving interval of 100 ms.

§ Composed for preference judgments of alto-recorder soloists (Section 7.1).




Figure 3.1. Examples of analyzing the long-time ACF ($2T = 35$ s). Effective duration of
the normalized ACF is defined by the delay $\tau_e$ at which the envelope of the normalized ACF
becomes 0.1. (a) Music Motif A: Royal Pavane, composed by Gibbons, $\tau_e = 127$ ms; and
(b) Music Motif B: Sinfonietta, Opus 48, IV movement, Allegro con brio, $\tau_e = 43$ ms. Note
that, according to the characteristics of the loudspeaker used in the subjective judgments,
the effective duration of the ACF may differ slightly, for example, $\tau_e = 35$ ms (Music
Motif B).
Figure 3.2. (a) The normalized ACF of the harmonic components of $3f_0$, $4f_0$, and $5f_0$ in
which the missing fundamental (perceived pitch) is $f_0 = 200$ Hz; and (b) the real wave
form has three harmonic components with the third harmonic out of phase.


(Wightman, 1973). This phenomenon is well explained by the fine structure of the ACF
shown in Figure 3.2 (Sumioka and Ando, 1996). The normalized ACF of only the
third, fourth, and fifth harmonics clearly contains the period of the fundamental
frequency in its fine structure, as shown in Figure 3.2(a), but this period is not
clear in the real sound signal in time, as shown in Figure 3.2(b). In the auditory
pathways, therefore, the sound signals are assumed to be processed by the ACF 
in the time domain. This cannot be explained by the spectrum in the frequency 
domain only (see also Shoda and Ando, 1996). 
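
The missing-fundamental argument can be reproduced with a few lines of code. The sketch below synthesizes only the third, fourth, and fifth harmonics of $f_0 = 200$ Hz, as in Figure 3.2, and searches the normalized ACF for its largest peak away from zero delay. The sampling rate, the analysis window, the equal harmonic amplitudes and phases, and the 1 ms exclusion zone around the origin are assumptions made here for illustration.

```python
import math

fs = 20000      # sampling rate [Hz] (assumed)
f0 = 200.0      # missing fundamental [Hz], as in Figure 3.2
window = 2000   # 100 ms of signal, i.e. 20 periods of f0
max_lag = 150   # largest delay examined: 7.5 ms

# Signal containing only the harmonics 3*f0, 4*f0, and 5*f0.
signal = [sum(math.sin(2.0 * math.pi * k * f0 * i / fs) for k in (3, 4, 5))
          for i in range(window + max_lag)]


def normalized_acf(x, window, max_lag):
    """Normalized ACF phi(lag) = Phi(lag) / Phi(0) over a fixed window."""
    phi0 = sum(x[i] * x[i] for i in range(window))
    return [sum(x[i] * x[i + lag] for i in range(window)) / phi0
            for lag in range(max_lag + 1)]


phi = normalized_acf(signal, window, max_lag)

min_lag = 20  # skip the main lobe around zero delay (first 1 ms)
peak_lag = max(range(min_lag, max_lag + 1), key=lambda k: phi[k])
pitch_hz = fs / peak_lag
print(1000.0 * peak_lag / fs, pitch_hz)
```

The strongest peak lies at a delay of 5 ms, the period of the 200 Hz fundamental, even though the signal has no spectral component at that frequency.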
Other important subjective attributes of a sound field are best described based
on the ACF of the source signals, as detailed in Chapters 4 through 7. 

3.1.3. Short-Time Moving Autocorrelation Function of a 
Source Signal 

It is interesting to consider the fact that, in producing a sound signal from any kind 
of musical instrument, the radiated sound comes from a nonlinear process. There- 
fore, it may produce special and various sound properties that seem to be known 
and utilized by many musicians in their performance, without any knowledge of 
acoustic science. This includes the phenomena of bifurcations and chaos, yield- 
ing, for instance, quasi-periodic oscillations that may come from a feedback with 
a time delay that occurs in most musical instruments (Gueth, 1987; Lauterborn
and Parlitz, 1988). In the case of string instruments, the oscillation comes from
coupling of the string with the resonance of the bridge in the time domain (Muller
and Lauterborn, 1996). Effects of such properties have been tackled by the method
of analyzing the time series or the phase-space representation. 

Since a certain degree of coherence exists in the time sequence of the source 
signals, which may greatly influence subjective attributes of the sound field, use 
is made here of the short ACF as well as the long-time ACF. 

The short-time moving ACF as a function of time t is calculated as 

$$\phi_p(\tau) = \phi_p(\tau; t, T) = \frac{\Phi_p(\tau; t, T)}{[\Phi_p(0; t, T)\,\Phi_p(0; \tau + t, T)]^{1/2}}, \tag{3.10}$$

where

$$\Phi_p(\tau; t, T) = \frac{1}{2T}\int_{t-T}^{t+T} p'(s)\,p'(s+\tau)\,ds. \tag{3.11}$$

The normalized ACF satisfies the condition that <p p (0) = 1. 
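
A discrete version of the short-time moving ACF can be sketched as follows. It is a minimal illustration under stated assumptions: the integral over $[t - T, t + T]$ is replaced by a sum over samples, the signal is assumed to be already weighted by the ear sensitivity, and the function name and test tone are hypothetical.

```python
import math


def running_acf(p, fs, t_center, half_window, max_delay):
    """Normalized running ACF phi_p(tau; t, T) for delays 0..max_delay.

    p           -- sampled signal p'(s)
    t_center    -- analysis time t [s]
    half_window -- T [s]; the integration interval is 2T
    max_delay   -- largest delay tau [s]
    """
    c = int(round(t_center * fs))
    h = int(round(half_window * fs))
    n_lags = int(round(max_delay * fs))
    lo = c - h
    if lo < 0 or c + h + n_lags > len(p):
        raise ValueError("analysis window exceeds the signal")

    def big_phi(lag, start):
        # (1/2T) * sum of p'(s) p'(s + lag) over a window of 2T samples
        return sum(p[s] * p[s + lag] for s in range(start, start + 2 * h)) / (2 * h)

    phi0_t = big_phi(0, lo)
    result = []
    for lag in range(n_lags + 1):
        # The second energy term is taken over the window shifted by the delay.
        denom = math.sqrt(phi0_t * big_phi(0, lo + lag))
        if denom == 0.0:
            raise ValueError("silent analysis window")
        result.append(big_phi(lag, lo) / denom)
    return result


# Test tone: 100 Hz sine, 1 s long, analyzed at t = 0.5 s with 2T = 0.2 s.
fs = 8000
tone = [math.sin(2.0 * math.pi * 100.0 * i / fs) for i in range(fs)]
phi = running_acf(tone, fs, t_center=0.5, half_window=0.1, max_delay=0.02)
print(round(phi[0], 3), round(phi[40], 3), round(phi[80], 3))
```

Normalizing by the energies of both the original and the delayed window keeps the magnitude at or below unity even when the signal level changes within the analysis interval, and it gives unity at zero delay.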

To demonstrate the procedure for obtaining the effective duration of the analyzed
short-time ACF, Figure 3.3 shows the absolute value in logarithmic form as
a function of the delay time. The envelope decay of the initial and important part of the
ACF may be fitted by a straight line in most cases. The effective duration of the ACF,
defined by the delay $\tau_e$ at which the envelope of the ACF becomes −10 dB (or 0.1;
the ten-percentile delay), can easily be obtained from the decay rate extrapolated in
the range from 0 dB, at the origin, to −5 dB.
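
The straight-line procedure described above can be sketched in code. This is one possible reading, not the author's implementation: the envelope is approximated by the local maxima of $|\phi_p(\tau)|$, levels are computed as $10\log_{10}|\phi_p|$ so that 0.1 corresponds to −10 dB, a least-squares line is fitted to the peaks lying between 0 and −5 dB, and the line is extrapolated to −10 dB. The function name and the synthetic ACF are assumptions.

```python
import math


def effective_duration_ms(phi, fs, fit_floor_db=-5.0, target_db=-10.0):
    """Estimate tau_e [ms] from a normalized ACF sampled at rate fs."""
    mag = [abs(v) for v in phi]
    # Envelope points: zero delay plus the local maxima of |phi|.
    peaks = [0] + [k for k in range(1, len(mag) - 1)
                   if mag[k] >= mag[k - 1] and mag[k] > mag[k + 1]]
    points = []
    for k in peaks:
        if mag[k] <= 0.0:
            break
        level_db = 10.0 * math.log10(mag[k])
        if level_db < fit_floor_db:
            break  # use only the initial part, from 0 dB down to the floor
        points.append((1000.0 * k / fs, level_db))
    if len(points) < 2:
        raise ValueError("not enough envelope peaks above the fit floor")

    # Least-squares straight line: level = intercept + slope * delay.
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return (target_db - intercept) / slope


# Synthetic normalized ACF: exponentially decaying envelope (time constant
# 20 ms) with a 500 Hz fine structure; the envelope reaches 0.1 at
# 20 ms * ln(10), about 46 ms.
fs = 48000
tau_c = 0.020
phi = [math.exp(-(k / fs) / tau_c) * math.cos(2.0 * math.pi * 500.0 * k / fs)
       for k in range(4800)]

tau_e = effective_duration_ms(phi, fs)
print(round(tau_e, 1))
```

For measured signals the envelope is less regular than in this synthetic case, so the choice of peaks entering the fit affects the result and should be checked against a plot such as Figure 3.3.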

The short-time effective durations of the ACF for various signal durations $2T$
with the moving interval are obtained in such a way. Examples of analyzing the
moving ACF of the music motif K are shown in Figure 3.4(a) through (f). The signal
duration corresponding to the psychological present, as suggested by Fraisse
(1982), is $2T = 0.5$–$5.0$ s. Figure 3.5(a) through (f) shows the moving $\tau_e$ for
music motifs A and B, $2T = 2.0$ s and 5.0 s. The psychological present defined
here is a short time duration of stimuli needed for subjective responses. Since the
minimum value of the moving $\tau_e$ is the most active part of each piece, containing
important information and influencing subjective responses for the temporal criteria
as discussed in Chapter 6 (Ando, Okano, and Takezoe, 1989; see also Mouri
and Ando, 1998), the values of $(\tau_e)_{\min}$ are plotted in Figure 3.6 as a function of $2T$.

It is interesting that stable values of $(\tau_e)_{\min}$ may be obtained in the range of
$2T = 0.5$ to 2.0 s for these extreme music motifs.

Figure 3.3. Examples of determining the effective duration of the running ACF (Music
Motif K). (a) $\tau_e = 65$ ms; and (b) $\tau_e = 100$ ms.


3.2. Autocorrelation Function of Piano Signal with 
Varying Performing Style 

One of the typical music sources, the piano signal, is selected here. In order to 
examine the behavior of the ACF of piano signals of varying performing styles, 
a piano was controlled by a computer for reproduction of the source signal, and 










Figure 3.5 a, b, c. Effective duration of the running ACF with a 1 00 ms interval as a function 
of the integration time, 2 T, of the source signals (30 s). (a) Music Motif A (Gibbons), 
2 T = 2 s; (b) Music Motif A (Gibbons), 2 T = 5 s; and (c) Music Motif B (Arnold), 


2T = 2 s. 





Figure 3.5d, e, f. Effective duration of the running ACF with a 100 ms interval as a function
of the integration time, $2T$, of the source signals (30 s). (d) Music Motif B (Arnold),
$2T = 5$ s; (e) Music Motif K (Yamamoto), $2T = 2$ s; and (f) Music Motif K (Yamamoto),
$2T = 5$ s.


Figure 3.6. Minimum values of the effective duration of the ACF as a function of $2T$.
(o): Music Motif A (Gibbons); (A): Music Motif B (Arnold); and (i): Music Motif K 
(Yamamoto). 


signals recorded in an anechoic chamber were analyzed (Taguti and Ando, 1997).
As is described in Chapters 4 and 6, the effective duration of the ACF, $\tau_e$, is the
fundamental time unit of the sound field in a concert hall (e.g., see Figure 6.11). Of
particular importance is how the played music fuses with the total sound field. The
performance style is basically related to important subjective attributes, such as
temporal factors of the sound field, and determines the most preferred initial time
delay gap between the direct sound and the first reflection, and also the optimum
subsequent reverberation time. If the effective duration of the ACF is varied by
the performing style, then the musician may control it to approach the preferred
temporal condition of the sound field, so that the music being performed and the
sound field suit each other for both performer and listeners.

In piano performances, the effective duration of the ACF may be controlled by 
means of: 

(1) speed of performance or tempo;

(2) dynamics; 

(3) articulation; 

(4) synchronization; 

(5) pedaling; and 

(6) note-off tail. 

Typical examples of the ACF for changing styles of piano performance — 
staccato and legato — are shown in Table 3.2. As expected, a fast tempo results 
Table 3.2. Various styles of piano performance and the effective
duration of the ACF, $\tau_e$. The music piece used is the opening eight bars
of Exercise No. 1 by Hanon. Tempo: MM = 120, under constant dynamics.

Style of performance    NOD [ms]    s [%]*    $\tau_e$ [ms]
Staccato                50          70        61-87
Legato                  125         0         106-170
Super legato            160         -30       170-233
Mixed                   -           -         110-155

* s = (IOI - NOD)/IOI, where IOI is the inter-onset interval and NOD is the
note-on duration.


in a short value of the effective duration of the ACF, $\tau_e$, and a slow tempo leads
to a long value. The use of the damper pedal creates long values of $\tau_e$. The
minimum values of $\tau_e$ correspond roughly to values of the note-on duration
(NOD) of a note as shown in Table 3.2. Staccato shortens the values of $\tau_e$ as the
acuteness increases, but the values become no shorter than the minimum value
of 60 ms. This lower limit may be caused by the mechanism that produces sound
in the piano. So far, we have noted that the effective duration of the ACF, $\tau_e$,
of source signals may be controlled by changing the performing style.


3.3. Sound Transmission from a Point Source 
to Binaural Entrances 


Let us consider the sound transmission from a source point in a free field to the
binaural ear-canal entrances. Let $p(t)$ be the source signal as a function of time, $t$,
and let $g_l(t)$ and $g_r(t)$ be impulse responses between the source point $r_0$ and the
binaural entrances. Then the sound signals arriving at the entrances are expressed
by

$$f_l(t) = p(t) * g_l(t), \qquad f_r(t) = p(t) * g_r(t), \tag{3.12}$$

where the asterisk denotes convolution.

The impulse responses $g_{l,r}(t)$ include the direct sound and reflections
$w_n(t - \Delta t_n)$ in the room as well as the head-related impulse responses
$h_{nl,r}(t)$, such that

$$g_{l,r}(t) = \sum_{n=0}^{\infty} A_n w_n(t - \Delta t_n) * h_{nl,r}(t), \tag{3.13}$$

where $n$ denotes the number of reflections with horizontal angle $\xi_n$ and elevation
$\eta_n$; $n = 0$ signifies the direct sound ($\xi_0 = 0$, $\eta_0 = 0$), for which
$A_0 w_0(t - \Delta t_0) = \delta(t)$, $\Delta t_0 = 0$, $A_0 = 1$, $\delta(t)$ being the Dirac delta
function; $A_n$ is the pressure amplitude of the $n$th reflection, $n > 0$; $w_n(t)$ is the
impulse response of the walls for each path of reflection arriving at the listener,
$\Delta t_n$ being the delay time of reflection relative to that of the direct sound; and
$h_{nl,r}(t)$ are the impulse responses for diffraction of the head and pinnae for the
single sound direction of $n$. Therefore, Equation (3.12) becomes

$$f_{l,r}(t) = \sum_{n=0}^{\infty} p(t) * A_n w_n(t - \Delta t_n) * h_{nl,r}(t). \tag{3.14}$$

If the source has a certain directivity, $p(t)$ is replaced by $p_n(t)$.
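
As an illustration of Equation (3.14), the following minimal sketch builds one ear signal from a sampled source signal, a list of reflections, and one head-related impulse response per direction. It is not the simulation system used in this book: the function names are hypothetical, the walls are assumed ideal ($w_n(t) = \delta(t)$), delays are rounded to whole samples, and the toy impulse responses are invented for the example.

```python
def convolve(x, h):
    """Discrete linear convolution of two sequences of floats."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def delayed_impulse(amplitude, delay_samples):
    """A_n * delta(t - dt_n): an ideal wall (assumed w_n = delta) with gain and delay."""
    return [0.0] * delay_samples + [amplitude]


def ear_signal(p, reflections, hrirs):
    """Sum over n of p * A_n w_n(t - dt_n) * h_n for one ear, as in Eq. (3.14).

    reflections: list of (A_n, delay in samples), with n = 0 the direct sound.
    hrirs: list of head-related impulse responses, one per entry in reflections.
    """
    parts = [convolve(convolve(p, delayed_impulse(a, d)), h)
             for (a, d), h in zip(reflections, hrirs)]
    out = [0.0] * max(len(s) for s in parts)
    for s in parts:
        for i, v in enumerate(s):
            out[i] += v
    return out


# Toy data (assumed): a two-sample source, the direct sound, and one reflection
# at half amplitude arriving 3 samples later; both HRIRs are unit impulses.
p = [1.0, 0.5]
f_l = ear_signal(p, [(1.0, 0), (0.5, 3)], [[1.0], [1.0]])
print(f_l)
```

Calling `ear_signal` twice, once with the left-ear and once with the right-ear responses, gives the pair $f_l(t)$ and $f_r(t)$.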


3.4. Physical Factors of Sound Field 


3.4.1. Temporal-Monaural Criteria 


As far as the auditory system is concerned, all factors influencing any subjective
attributes must be included in the sound pressures at the binaural entrances; these
are expressed by Equation (3.14).

The first important parameter, which depends on the source program, is the sound
signal $p(t)$. This is represented by the ACF defined by Equation (3.4). The ACF
is factored into the energy of the sound signal $\Phi_p(0)$ and the normalized ACF
as expressed by Equations (3.4) through (3.7). The normalized ACF includes its
envelope, represented by $\tau_e$, peak-amplitudes with the delays, and the zero-crossing
number monaural criterion.

The second parameter is the set of impulse responses of the reflecting walls,
$A_n w_n(t - \Delta t_n)$. The amplitudes of reflection relative to that of the direct sound,
$A_1, A_2, \ldots$, are determined by the pressure decay due to the paths $d_n$, such that

$$A_n = \frac{d_0}{d_n}, \tag{3.15}$$

where $d_0$ is the distance between the source point and the center of the listener’s
head. The impulse responses of reflections to the listener, $w_n(t - \Delta t_n)$, have delay
times of $\Delta t_1, \Delta t_2, \ldots$ relative to that of the direct sound, given by

$$\Delta t_n = \frac{d_n - d_0}{c}. \tag{3.16}$$


These parameters are not physically independent; in fact, the values of $A_n$ are
closely related to $\Delta t_n$ in such a manner that

$$\Delta t_n = \frac{d_0 (1/A_n - 1)}{c}. \tag{3.17}$$


In addition, the initial time-delay gap between the direct sound and the first reflec-
tion, $\Delta t_1$, is statistically related to $\Delta t_2, \Delta t_3, \ldots$, which depend on the dimensions
of the room. In fact, the echo density is proportional to the square of the time delay
(Kuttruff, 1991). Thus, the initial time-delay gap $\Delta t_1$ is regarded as a representation
of both sets of $\Delta t_n$ and $A_n$ ($n = 1, 2, \ldots$).
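
The link between the amplitude and the delay of a reflection can be checked numerically. The sketch below assumes spherical spreading only, with no wall absorption, takes $c$ as the speed of sound, and uses an assumed value of 343 m/s; the function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def reflection_amplitude(d0, dn):
    """Relative pressure amplitude A_n = d0 / dn of a reflection with path length dn."""
    return d0 / dn


def reflection_delay(d0, dn, c=SPEED_OF_SOUND):
    """Delay dt_n = (dn - d0) / c relative to the direct sound, in seconds."""
    return (dn - d0) / c


def delay_from_amplitude(d0, a_n, c=SPEED_OF_SOUND):
    """Delay dt_n = d0 * (1 / A_n - 1) / c, i.e. the delay implied by the amplitude."""
    return d0 * (1.0 / a_n - 1.0) / c


# A listener 10 m from the source and a reflection path of 20 m (assumed figures).
a1 = reflection_amplitude(10.0, 20.0)
dt1 = reflection_delay(10.0, 20.0)
print(a1, dt1, delay_from_amplitude(10.0, a1))
```

Both delay functions return the same value for the same geometry, which is the sense in which $A_n$ and $\Delta t_n$ are not independent.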

Another parameter is the set of the impulse responses of the $n$th reflection, $w_n(t)$,
being expressed by

$$w_n(t) = w_n(t)^{(1)} * w_n(t)^{(2)} * \cdots * w_n(t)^{(i)}, \tag{3.18}$$

where $w_n(t)^{(i)}$ is the impulse response of the $i$th wall existing in the path of the
$n$th reflection from the source to the listener.

Such a set of impulse responses $w_n(t)^{(i)}$ may be represented by a statistical decay
rate, namely the subsequent reverberation time, $T_{\mathrm{sub}}$, because $w_n(t)^{(i)}$ includes the
absorption coefficient as a function of frequency. This coefficient is given by

$$\alpha_n(\omega)^{(i)} = 1 - |W_n(\omega)^{(i)}|^2. \tag{3.19}$$


According to Sabine’s formula (1900), the subsequent reverberation time is
approximately calculated by

$$T_{\mathrm{sub}} \approx \frac{KV}{\bar{\alpha} S}, \tag{3.20}$$

where $K$ is a constant (about 0.162), $V$ is the volume of the room, $S$ is the total
surface area, $\bar{\alpha}$ is the average absorption coefficient of the walls, and $\bar{\alpha} S$ is given by the
summation of the absorption of each surface $i$, so that

$$\bar{\alpha} S = \sum_i \alpha(\omega)^{(i)} S^{(i)}. \tag{3.21}$$
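
A short sketch of Equations (3.20) and (3.21) follows. It assumes metric units (volume in cubic metres, areas in square metres), for which $K \approx 0.162$ applies, and a single frequency band, so each surface has one absorption coefficient. The function name and the room data are hypothetical.

```python
def sabine_reverberation_time(volume, surfaces, k=0.162):
    """Approximate T_sub = K V / (alpha S) in seconds.

    volume: room volume in cubic metres.
    surfaces: list of (absorption coefficient, area in square metres), one
              entry per surface, for a single frequency band.
    """
    # Eq. (3.21): total absorption is the sum over surfaces of alpha_i * S_i.
    total_absorption = sum(alpha * area for alpha, area in surfaces)
    return k * volume / total_absorption


# Assumed example: a 1000 m^3 room with 500 m^2 of hard surface (alpha = 0.1)
# and 100 m^2 of absorbent surface (alpha = 0.5).
t_sub = sabine_reverberation_time(1000.0, [(0.1, 500.0), (0.5, 100.0)])
print(t_sub)
```

Because the absorption coefficients depend on frequency, a full calculation would repeat this for each band of interest.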


3.4.2. Spatial-Binaural Criteria 

Two sets of head-related impulse responses for the two ears, $h_{nl,r}(t)$, constitute
the remaining objective parameter. These two responses, $h_{nl}(t)$ and $h_{nr}(t)$, play an
important role in sound localization and spatial impression, but are not mutually
independent objective factors. For example, $h_{nl}(t) = h_{nr}(t)$ in the median plane
($\xi = 0^\circ$), and there are certain relations between them for any other direction to
a listener. In fact, a certain relationship between the IATD and the IALD can be
expressed for a single directional sound arriving at a listener for a given source
signal, and thus for any sound field with multiple reflections. A particular example
is that, when the IATD is zero, then the IALD is nearly zero.

Therefore, to represent the interdependence between two impulse responses,
a single factor may be introduced, i.e., the interaural cross-correlation function
between the sound signals at both ears, $f_l(t)$ and $f_r(t)$, which is defined by

$$\Phi_{lr}(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} f'_l(t) f'_r(t + \tau)\, dt, \qquad |\tau| < 1\ \text{ms}, \tag{3.22}$$

where $f'_l(t)$ and $f'_r(t)$ are approximately obtained by signals $f_{l,r}(t)$ after passing
through the A-weighted network, which corresponds to the ear sensitivity, $s(t)$.
The ear sensitivity may be characterized by the external and the middle ear as
described in Section 5.1.
The normalized interaural cross-correlation function is defined by

$$\phi_{lr}(\tau) = \frac{\Phi_{lr}(\tau)}{\sqrt{\Phi_{ll}(0)\Phi_{rr}(0)}}, \tag{3.23}$$

where $\Phi_{ll}(0)$ and $\Phi_{rr}(0)$ are autocorrelation functions at $\tau = 0$ for the left and
right ear, respectively, or the sound energies arriving at both ears.

Also, the denominator of Equation (3.23),

$$\sqrt{\Phi_{ll}(0)\Phi_{rr}(0)}, \tag{3.24}$$

is the geometric mean of the sound energies arriving at the two ears.

If discrete reflections arrive after the direct sound, then the normalized interaural
cross-correlation is expressed by

$$\phi_{lr}^{(N)}(\tau) = \frac{\sum_{n=0}^{N} A_n^2 \Phi_{lr}^{(n)}(\tau)}{\sqrt{\sum_{n=0}^{N} A_n^2 \Phi_{ll}^{(n)}(0) \sum_{n=0}^{N} A_n^2 \Phi_{rr}^{(n)}(0)}}, \tag{3.25}$$

where we put $w_n(t) = \delta(t)$, $\Phi_{lr}^{(n)}(\tau)$ is the interaural cross-correlation of the
$n$th reflection, and $\Phi_{ll}^{(n)}(0)$ and $\Phi_{rr}^{(n)}(0)$ are the respective sound energies arriving at the
two ears from the $n$th reflection. The denominator of Equation (3.25) indicates the
geometric mean of the sound energies at the two ears.

The magnitude of the interaural cross-correlation is defined by

$$\mathrm{IACC} = |\phi_{lr}(\tau)|_{\max} \tag{3.26}$$

for the possible maximum interaural time delay, say,

$$|\tau| < 1\ \text{ms}.$$
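
The definitions in Equations (3.22), (3.23), and (3.26) can be sketched for sampled ear signals as follows. This is an illustration under stated assumptions rather than the measurement procedure of this book: the A-weighting stage is omitted, the limit over $T$ is replaced by a sum over a finite record (the factor $1/2T$ cancels in the normalization), lags are restricted to whole samples, and the function name and test signals are hypothetical.

```python
import math


def iacc(left, right, fs, max_lag_ms=1.0):
    """Return (IACC, delay of the maximum in ms) for two equal-length ear signals.

    A positive delay means the right-ear signal lags the left-ear signal.
    """
    n = len(left)
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    # Geometric mean of the two signal energies (denominator of Eq. 3.23).
    norm = math.sqrt(sum(v * v for v in left) * sum(v * v for v in right))
    best_val, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Phi_lr(tau): sum of left(t) * right(t + tau) over the overlapping part.
        acc = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                acc += left[t] * right[u]
        val = abs(acc) / norm
        if val > best_val:
            best_val, best_lag = val, lag
    return best_val, 1000.0 * best_lag / fs


# Assumed test signals: the right-ear signal is the left-ear pulse delayed
# by 2 samples, which is 0.2 ms at a 10 kHz sampling rate.
pulse = [1.0, -0.5, 0.25, 0.8]
left = [0.0] * 20 + pulse + [0.0] * 20
right = [0.0] * 22 + pulse + [0.0] * 18
value, tau_ms = iacc(left, right, fs=10000)
print(value, tau_ms)
```

For identical signals shifted in time the function returns a value close to 1 at the shift; uncorrelated signals at the two ears give values near 0.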


When the sound source is located at any horizontal angle $\xi$ relative to the frontal
direction to a listener’s head, and the bandpass noise, after passing through an ideal
filter with upper and lower frequencies of $f_2$ and $f_1$, is radiated from the source
location, then the interaural cross-correlation function and the autocorrelation
functions at $\tau = 0$ are given by

$$\Phi_{lr}(\tau) = H_{lr}\, \frac{\sin[\Delta\omega(\tau - \tau_\xi)/2]}{\Delta\omega(\tau - \tau_\xi)/2}\, \cos[\Delta\omega_c(\tau - \tau_\xi)/2],$$
$$\Phi_{ll}(0) = H_{ll}, \qquad \Phi_{rr}(0) = H_{rr}, \tag{3.27}$$

where $H_{lr}$ is the cross power of the bandpass noise, $H_{ll}$ and $H_{rr}$ are the auto powers
at the two ear entrances, $\tau_\xi$ is the maximum interaural delay depending on $\xi$, and

$$\Delta\omega_c = 2\pi(f_2 + f_1), \qquad \Delta\omega = 2\pi(f_2 - f_1). \tag{3.28}$$


For the calculation of the sound fields with any spectrum signal, the needed data
have been reported by Nakajima, Yoshida, and Ando (1993); see also Ando (1985).

The interaural delay time at which the IACC is defined, as shown in Figure 3.7,
is $\tau_{\mathrm{IACC}}$. Thus, both the IACC and $\tau_{\mathrm{IACC}}$ may be obtained from the condition

$$\frac{\partial \phi_{lr}(\tau)}{\partial \tau} = 0.$$

In the simple sound field described by Equation (3.27), $\tau_{\mathrm{IACC}}$ corresponds to
the interaural time delay for the horizontal angle $\xi$ defined by $\tau_\xi$. When $\tau_{\mathrm{IACC}}$
is zero (one of the preferred conditions), then usually a frontal sound image and
well-balanced sound field are perceived.

The width of the interaural cross-correlation function, defined by the interval of
delay time at a value of $\delta$ below the IACC, corresponding to the JND of the IACC,
is given by $W_{\mathrm{IACC}}$ (Figure 3.7). Thus, the apparent source width (ASW) may
be perceived as a directional range corresponding mainly to $W_{\mathrm{IACC}}$. For the
sound field with $\tau_{\mathrm{IACC}} = 0$, for example, the term $\sin Z / Z$ in Equation (3.27)
is nearly unity, because $Z = \Delta\omega\tau/2$ is small enough, so that

$$W_{\mathrm{IACC}} \approx \frac{4}{\Delta\omega_c} \cos^{-1}\left(1 - \frac{\delta}{\mathrm{IACC}}\right). \tag{3.29}$$
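
Equation (3.29) is simple to evaluate directly. In the sketch below the function name is hypothetical, the result is in seconds when the band edges are in hertz, and the value of $\delta$ in the usage example is an assumption chosen only for illustration.

```python
import math


def w_iacc(iacc_value, delta, f1, f2):
    """Approximate width of the interaural cross-correlation peak, Eq. (3.29).

    iacc_value: magnitude of the IACC; delta: drop below the IACC at which the
    width is read; f1, f2: lower and upper band edges in hertz.
    """
    delta_omega_c = 2.0 * math.pi * (f1 + f2)  # Eq. (3.28)
    return 4.0 / delta_omega_c * math.acos(1.0 - delta / iacc_value)


# Assumed example: noise band from 400 Hz to 600 Hz, IACC = 0.9, delta = 0.09.
width_s = w_iacc(0.9, 0.09, 400.0, 600.0)
print(width_s * 1000.0, "ms")
```

The width shrinks as the band moves to higher frequencies, since $\Delta\omega_c$ appears in the denominator.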


A well-defined directional impression corresponding to the interaural time delay
$\tau_{\mathrm{IACC}}$ is perceived when listening to sound with a sharp peak in the interaural
cross-correlation function with a small value of $W_{\mathrm{IACC}}$. On the other hand, when
listening to a sound field with a low value of IACC < 0.15, then subjectively
diffuse sound is perceived.

Therefore, these four factors, the geometric mean of sound energies at the two
ears, the IACC, $\tau_{\mathrm{IACC}}$, and $W_{\mathrm{IACC}}$, are independently related to the space-oriented
subjective attributes, for example, the subjective diffuseness, the image shift, and
the ASW. Further discussion in this area appears in Section 6.1.


3.5. Simulation of Sound Field 

According to Equation (3.14), sound fields in a room may be simulated by taking 
the directional information of its sound source and of its early reflections into 
consideration (Ando, 1985). 

An example of the block diagram of the simulation system for the direct sound,
two early reflections, and diffused reverberation is shown in Figure 3.8 (Ando
et al., 1973), which was used in all subjective judgment experiments and in
recording the electro-physiological responses described in this book. In order
to realize a small value of IACC, the directions of four loudspeakers for sub- 
sequent reverberation (Rev.) are chosen about 55° from the median plane, and 
the incoherent reverberation signals supplied to the loudspeakers were delayed by 
Ar y (j = 1,2,3). 
Figure 3.7. Definitions of the IACC, $\tau_{\mathrm{IACC}}$, and $W_{\mathrm{IACC}}$ for the interaural cross-correlation
function. [Horizontal axis: interaural delay $\tau$ in ms, from -1.0 (left-ear signal delayed)
to 1.0 (right-ear signal delayed).]
Figure 3.8. A simulation system of sound fields. [Block diagram; recoverable labels:
reverberation-free signal $p(t)$; anechoic chamber.]
Subjective Preference as an Overall
Impression of the Sound Field


A difficulty arises in the investigation of subjective attributes for the sound field in 
a room, because the sound field consists of a great number of reflections. However, 
the fundamental attributes are contained in the simplest sound field, which consists 
of the direct sound, and a single reflection as representative of a set of reflections. 
The first part of this chapter describes the results of subjective preference studies
in relation to the temporal factor, $\Delta t_1$, and the spatial factor, IACC, of the sound
field. Then, the orthogonal properties of the four acoustic factors are described for
the sound field in the room, including the reverberation time.

After obtaining optimum design objectives, the theory of subjective preference 
is derived. Based on this theory, an example of calculating subjective preference 
at each seat is demonstrated. In order to examine the theory, subjective preference 
judgments were conducted in an existing hall without the subjects changing seats 
in a paired-comparison judgment. 


4.1. Subjective Preference of the Simple Sound Field 

4.1.1. Preferred Delay Time of a Single Reflection 

The sound field consists of the direct sound, $\xi_0 = 0^\circ$ ($\eta_0 = 0^\circ$), and a single
reflection from a fixed direction, $\xi_1 = 36^\circ$ ($\eta_1 = 9^\circ$). These angles were selected
since they are typical in a concert hall. The delay time $\Delta t_1$ was adjusted in the
range of 6 ms to 256 ms. Paired-comparison tests were performed for all pairs in an
anechoic chamber using normal-hearing subjects with two different music motifs,
A and B (Table 3.1). The normalized scores of the sound fields as a function of the
delay are shown in Figure 4.1 (Ando, 1977; see also Kang and Ando, 1985).

Obviously, the most preferred delay time, with the maximum score, differs
greatly between the two motifs. When the amplitude of reflection $A_1 = 1$, the most
preferred delays are around 130 ms and 35 ms for motifs A and B, respectively. It
is found that this corresponds to effective durations of the ACF of source signals
of 127 ms (motif A) and 35 ms (motif B).
Figure 4.1. Preference scores of the sound fields as a function of the delay, $A_1 = 1$ (6 sound
fields and 13 subjects (Ando, 1977)). Preference scores are directly obtained for each pair
of sound fields by giving +1 and -1, corresponding to the positive and negative judgments,
respectively. Also, the normalized score is obtained by accumulating the scores for all sound
fields (F) tested and all subjects (S), and then dividing by the factor S(F - 1). A: Music
Motif A, Royal Pavane by Gibbons, $\tau_e = 127$ ms; and B: Music Motif B, Sinfonietta, Opus
48, III movement, by Malcolm Arnold, $\tau_e = 35$ ms.
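
The scoring rule described in the caption of Figure 4.1 can be sketched as follows. The data layout (one winner/loser pair per judgment) and the function name are assumptions made for this example; only the +1/-1 scoring and the division by S(F - 1) come from the caption.

```python
def normalized_scores(judgments, n_fields, n_subjects):
    """Normalized preference score of each sound field.

    judgments: iterable of (preferred, rejected) sound-field indices, one entry
    per subject and per pair. Each judgment adds +1 to the preferred field and
    -1 to the rejected one; totals are divided by S(F - 1).
    """
    totals = [0] * n_fields
    for preferred, rejected in judgments:
        totals[preferred] += 1
        totals[rejected] -= 1
    return [t / (n_subjects * (n_fields - 1)) for t in totals]


# Assumed toy data: one subject comparing three sound fields, always
# preferring the field with the lower index.
scores = normalized_scores([(0, 1), (0, 2), (1, 2)], n_fields=3, n_subjects=1)
print(scores)
```

With this normalization a field preferred in every comparison by every subject scores +1, and one rejected in every comparison scores -1.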


After inspection, the preferred delay is found roughly at a certain duration of
the ACF, defined by $\tau_p$, such that the envelope of the ACF becomes $0.1A_1$. Thus,
$\tau_p = \tau_e$ only when $A_1 = 1.0$. The data collected as a function of the duration $\tau_p$
are shown in Figure 4.2, where data from a continuous speech signal of $\tau_e = 12$ ms
are also plotted (see also Section 6.2.3). When the envelope of the ACF is
exponential, then it is expressed approximately by (Ando, 1985)

$$\tau_p = [\Delta t_1]_p \approx (1 - \log_{10} A_1)\tau_e. \tag{4.1}$$

It is worth noticing that the amplitude of reflection relative to that of the direct 
sound should be measured by the most accurate method, for example, the square 
root of the ACF at the origin of the delay time. 

Two reasons can be given for explaining why the preference decreases for the
short delay range of reflection, $0 < \Delta t_1 < \tau_p$ (Figure 4.3):
Figure 4.2. Relationship between the preferred delay of a single reflection and the duration
of the ACF such that $|\phi_p(\tau)|_{\mathrm{envelope}} = 0.1A_1$. Ranges of the preferred delays are graphically
obtained at 0.1 below the maximum score. A, B, and S refer to Music Motif A, Motif B, and
speech, respectively. Different symbols indicate the center values obtained at the reflection
amplitudes of +6 dB (o), 0 dB (•), and -6 dB (□), respectively (13 to 19 subjects).


(1) tone coloration effects occur because of the interference phenomenon in the
coherent time region (Section 6.2.3); and

(2) the IACC increases when $\Delta t_1$ is near 0.

On the other hand, echo disturbance effects can be observed when $\Delta t_1$ is greater
than $\tau_p$.


4.1.2. Preferred Direction of a Single Reflection 

The delay time of the reflection, in the experiment showing the preferred direc-
tion of a single reflection, was fixed at 32 ms. The direction was specified by
loudspeakers located at $\xi_0 = 0^\circ$ ($\eta_0 = 27^\circ$) and $\xi_1 = 18^\circ, 36^\circ, \ldots, 90^\circ$
($\eta_1 = 9^\circ$).

Results of the preference tests are shown in Figure 4.4. No fundamental dif- 
ferences are observed between the curves of the two motifs, in spite of the great 
Figure 4.3. Subjective attributes before and after the preferred delay time of reflection
$[\Delta t_1]_p$ ($= \tau_p$).


difference of $\tau_e$. The preference score increases roughly with decreasing IACC. The
correlation coefficient between the score and the IACC is -0.8 (at 1% significance
level: p < 0.01). The score with motif A at $\xi_1 = 90^\circ$ drops to a negative value,
indicating that the lateral reflections, coming only from around $\xi_1 = 90^\circ$, are
not always preferred. The figure shows that there is a preference for angles less
than $\xi_1 = 90^\circ$, and on average there may be an optimum range centered at about
$\xi_1 = 55^\circ$. Similar results can be seen in the data from speech signals (Ando and
Kageyama, 1977; see also Section 6.4).


4.2. Orthogonal Properties of Acoustic Factors 

In order to examine the independence of the effects of the four physical factors on 
subjective preference judgments and to make continuous the linear scale value of 
preference for any sound field, two of the four factors were varied simultaneously 
while the remaining two were held constant. For convenience, sound fields in a 
concert hall with the same plan as the Symphony Hall in Boston, as shown in 
Figure 4.5(a) were simulated. The system simulating the sound fields in concert 
halls is shown in Figure 3.8 of the previous chapter. A computer program provides
the time delay of two early reflections ($n = 1, 2$) and the subsequent reverberation
($n > 2$), relative to the direct sound (Figure 4.5(b)). In order to represent the
geometrical size of a similar room, the scale of dimension (SD) is introduced
Figure 4.4. Preference scores and the magnitude of the IACC of the sound fields as a
function of the horizontal angle of a single reflection, $A_1 = 0$ dB (6 sound fields and 13
subjects).


as follows:

$$\Delta t_1 = 22\,(\mathrm{SD}), \quad \Delta t_2 = 38\,(\mathrm{SD}), \quad \Delta t_3 = 47\,(\mathrm{SD}) \quad [\text{ms}]. \tag{4.2}$$

The reverberation signal with constant frequency characteristics was generated 
by the Schroeder Reverberator (Schroeder, 1962). To obtain a natural sound, the 
conditions of the simulation system were carefully selected. 

Test A. In order to examine whether or not the temporal-monaural factors influ-
ence the scale values of subjective preference independently, paired-comparison
tests of the 16 sound fields for each source signal were conducted for changes of
SD and $T_{\mathrm{sub}}$ with 9 to 14 subjects (Ando, Okura, and Yuasa, 1982). Both of the
factors are closely associated with the left hemisphere of the human brain, as is
discussed in Sections 5.2 and 5.3.

Test B. In order to determine the independent influence of the spatial factors, LL
and IACC, paired-comparison tests of the 12 sound fields for each source signal
were performed with 13 to 14 subjects (Ando and Morioka, 1981). Both factors are
closely associated with the right hemisphere of the human brain, as is discussed
in Section 5.2.

Figure 4.5. (a) A sound field in a concert hall with a plan similar to that of the Symphony
Hall, Boston; and (b) amplitude decay of early reflections and subsequent reverberation
simulated for subjective preference tests.

Test C. For reconfirmation of the independence of the left and right hemispheric
factors, $T_{\mathrm{sub}}$ and IACC, preference tests were conducted for 16 sound fields with
8 subjects (Ando, Otera, and Hamana, 1983).

Results of the analyses of variance for the scale values obtained by the law of 
comparative judgments from these three tests are indicated in Table 4.1. Accord- 
ing to the significance level, each factor influences the scale value of preference 
independently (Ando, 1985). 


Table 4.1. Analyses of variance for three tests, A, B, and C, with 9-16 subjects.

Test  Factor                 Sum of    Degrees of  Mean    F     Significance  Contribution
                             squares   freedom     square        level         [%]

Music Motif A
A     $\Delta t_1$ (SD)      0.20      3           0.07    4.4   < 0.05        14
      $T_{\mathrm{sub}}$     0.73      3           0.24    17    < 0.01        65
      Residual               0.13      9           0.01    -
B     LL                     0.99      3           0.33    48    < 0.01        27
      IACC                   2.61      2           1.30    187   < 0.01        71
      Residual               0.04      6           0.01    -
C     $T_{\mathrm{sub}}$     2.44      3           0.82    68    < 0.01        89
      IACC                   0.17      3           0.06    5     < 0.05        5
      Residual               0.11      9           0.01    -

Music Motif B
A     $\Delta t_1$ (SD)      1.20      3           0.40    22    < 0.01        13
      $T_{\mathrm{sub}}$     7.63      3           2.54    141   < 0.01        84
      Residual               0.16      9           0.02    -
B     LL                     0.74      3           0.25    12    < 0.01        24
      IACC                   1.90      2           0.95    47    < 0.01        67
      Residual               0.12      6           0.02    -
C     $T_{\mathrm{sub}}$     2.55      3           0.85    182   < 0.01        79
      IACC                   0.64      3           0.21    46    < 0.01        19
      Residual               0.04      9           0.01    -


These results hold, even if scale values shifted in origin were applied in analyses. 
Table 4.2. Examinations on independent effects of each two of four objective factors
on the subjective preference judgments.

Factors             LL   $\Delta t_1$ (SD)   $T_{\mathrm{sub}}$        IACC

LL                  -    Ando and Okada*     None                      Test B: Ando and
                                                                       Morioka (1981)

$\Delta t_1$ (SD)        -                   Test A: Ando, Okura,      Ando and Imamura (1979);
                                             and Yuasa (1982)          Ando and Gottlob (1979)

$T_{\mathrm{sub}}$                           -                         Test C: Ando, Otera,
                                                                       and Hamana (1983)

* Unpublished: the effects of $\Delta t_1$ were examined under fixed conditions over a great
range of SL (Figure 4.6).


Other Tests. In addition, as listed in Table 4.2, subjective preference judgments
were performed in sound fields with multiple early reflections (Ando and Gottlob,
1979), and with subsequent reverberations (Ando and Imamura, 1979). These
results confirm that the factors $\Delta t_1$ (SD) and the IACC are independent of each other
in the subjective preference judgments.

As is discussed in Section 4.3.5, when the sensation level (SL) is weak enough, 
say 30 dB, the most preferred delay time of the reflection becomes longer than that 
at the preferred listening level around 80 dB. Thus, it may be concluded that the 
total scale value of subjective preference is determined by the law of superposition 
in the range of preferred conditions of the four factors tested. The consistency of 
the unit of the scale values obtained from the different preference tests has been 
discussed at length (Ando, 1985). 


4.3. Optimum Design Objectives 

According to such a systematic investigation of simulating sound fields in a concert 
hall by the aid of a computer and listening tests (paired-comparison tests), the 
optimum design objectives and the linear scale value of subjective preference 
may be described. The optimum design objectives can be described in terms of 
the subjectively preferred sound qualities, which are related to the temporal and 
spatial factors describing the sound signals arriving at the two ears. They clearly 
lead to comprehensive criteria for achieving the optimal design of concert halls as 
summarized below (Ando, 1983). 
4.3.1. Listening Level

The listening level is, of course, the primary criterion for listening to the sound 
field in concert halls. The preferred listening level depends upon the music and 
the particular passage being performed. For example, the gross preferred levels 
obtained by 16 subjects are in the peak ranges of 77 dBA to 79 dBA for Music 
Motif A (Royal Pavane by Gibbons) with a slow tempo, and 79 dBA to 80 dBA 
for Music Motif B (Sinfonietta by Arnold) with a fast tempo (see Figure 3.1). 


4.3.2. Early Reflections After the Direct Sound 

An approximate relationship for the most preferred delay time has been discovered
in terms of the autocorrelation function of source signals and the total amplitude
of reflections, $A$ (Ando, 1985). Generally, it is expressed by

$$[\Delta t_1]_p = \tau_p, \tag{4.3}$$

$$|\phi_p(\tau)|_{\mathrm{envelope}} \approx kA^c \quad \text{at } \tau = \tau_p, \tag{4.4}$$

where $k$ and $c$ are constants depending on the subjective attributes. (For the im-
portant subjective attributes, these constants and the range of values of $A$ tested
are listed in Table 6.1.) If the envelope of the ACF is exponential, then

$$\tau_p \approx \left(\log_{10}\frac{1}{k} - c \log_{10} A\right)\tau_e, \tag{4.5}$$

where the total pressure amplitude of reflection is given by

$$A = \left[A_1^2 + A_2^2 + A_3^2 + \cdots\right]^{1/2}. \tag{4.6}$$

The relationship of Equation (4.1) for a single reflection may be obtained by putting
$A = A_1$, $k = 0.1$, and $c = 1$, so that

$$\tau_p = [\Delta t_1]_p \approx (1 - \log_{10} A_1)\tau_e.$$
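
As a worked illustration of Equations (4.5) and (4.6), the sketch below evaluates the preferred delay for the subjective-preference constants quoted above ($k = 0.1$, $c = 1$). The function names are hypothetical, and the result is in the same unit as the effective duration passed in.

```python
import math


def total_amplitude(amplitudes):
    """Total pressure amplitude A = sqrt(A_1^2 + A_2^2 + ...), Eq. (4.6)."""
    return math.sqrt(sum(a * a for a in amplitudes))


def preferred_delay(tau_e, amplitude, k=0.1, c=1.0):
    """Preferred delay tau_p for an exponential ACF envelope, Eq. (4.5)."""
    return (math.log10(1.0 / k) - c * math.log10(amplitude)) * tau_e


# Music Motif A (tau_e = 127 ms) and Motif B (tau_e = 35 ms), single reflection.
print(preferred_delay(127.0, 1.0))   # reflection as strong as the direct sound
print(preferred_delay(35.0, 1.0))
print(preferred_delay(127.0, 0.5))   # weaker reflection: longer preferred delay
```

With $A = 1$ the preferred delay equals $\tau_e$, in line with the delays of about 130 ms and 35 ms reported for motifs A and B; a weaker reflection shifts the preferred delay to longer values.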


4.3.3. Subsequent Reverberation Time 
After the Early Reflections 

For flat frequency characteristics of reverberation (one of the preferred conditions),
the preferred subsequent reverberation time is expressed approximately by

$$[T_{\mathrm{sub}}]_p \approx 23\tau_e. \tag{4.7}$$

The values of $A$ tested were 1.1 and 4.1, which cover the usual conditions of the sound
field in a room. A lecture room and conference room must be designed for speech,
and an opera house and similar theaters for vocal music. For orchestra music, there
may be two or three types of concert-hall designs according to the effective duration
of the ACF. For example, Symphony No. 41 by Mozart, “Le Sacre du Printemps”
by Stravinsky, and Arnold’s Sinfonietta have short ACFs and fit orchestra music
of type (A). On the other hand, Symphony No. 4 by Brahms and Symphony No. 7
by Bruckner are typical of orchestra music of type (B). Much longer ACFs are typical for
pipe organ music, for example, by Bach.

The most preferred reverberation times estimated for each sound source are
shown in Figure 7.7 for the selection of music motifs to be performed. Considering
the fact that the value of $\tau_e$ is obtained at the ten percentile (or -10 dB) delay of the
envelope of the ACF of a source signal, the -60 dB delay time of the ACF envelope
corresponds roughly to the “reverberation time” contained in the source signal
itself, given by $6\tau_e$. Therefore, the most preferred reverberation time of the sound
field expressed by Equation (4.7) is about four times the “reverberation
time” contained in the source signal itself. The preferred frequency
characteristics are discussed in Section 6.2.2.
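
The arithmetic behind Equation (4.7) and the factor of about four can be sketched briefly. The function names are hypothetical; the effective durations are the values quoted earlier for Music Motifs A and B, converted to seconds.

```python
def preferred_reverberation_time(tau_e):
    """Preferred subsequent reverberation time, [T_sub]_p = 23 tau_e (Eq. 4.7)."""
    return 23.0 * tau_e


def source_decay_time(tau_e):
    """-60 dB delay of an exponential ACF envelope that reaches -10 dB at tau_e."""
    return 6.0 * tau_e


for name, tau_e in (("Motif A", 0.127), ("Motif B", 0.035)):
    t_pref = preferred_reverberation_time(tau_e)
    ratio = t_pref / source_decay_time(tau_e)
    print(name, round(t_pref, 3), "s, ratio", round(ratio, 2))
```

The ratio is 23/6 regardless of the source, which is the "about four times" stated in the text.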


4.3.4. Dissimilarity of Signals in Both Ears (IACC) 

All the available data indicate a negative correlation between the magnitude of the 
IACC and the subjective preference, i.e., dissimilarity of signals arriving at the two 
ears is preferred. This holds only under the condition that the maximum value of the 
interaural cross-correlation function is maintained at the origin of the time delay. 
If not, then an image shift of the source may occur (Section 4.5). To obtain a small 
magnitude of IACC in the most effective manner, the directions from which the 
early reflections arrive at the listener should be kept within a certain range of angles 
from the median plane, i.e., ±(55° ± 20°). It is obvious that the sound arriving 
from the median plane ±0° makes the IACC greater. Sound arriving from ±90° in 
the horizontal plane is not always advantageous, because the similar “detour” paths 
around the head to both ears cannot decrease the IACC effectively, particularly for 
frequency ranges higher than 500 Hz. For example, the most effective angles for 
the frequency ranges of 1 kHz and 2 kHz are about ±55° and ±36°, respectively 
(Figure 6.3). 


4.3.5. Effects of Sensation Level on the Preferred Delay 
Time of Reflections 

The purpose of this section is to understand the effects of the sensation level on
subjective preference judgments upon a change of temporal factors. As a typical
example, the effects on the preferred $\Delta t_1$ under different fixed sensation levels (SL)
were examined (Ando and Okada, unpublished data). The background noise in the listening
room was 17.5 dBA. The number of subjects was fifteen. The source signal used
here was Music Motif B. However, a different part of motif B (with $(\tau_e)_{\min} = 60$ ms)
from that of the other experiments described in this book was used; thus the calculated
value of the most preferred delay time $[\Delta t_1]_p$ is 60 ms when $A = 1$.

The results of the scale value as a function of $\Delta t_1$ are shown in Figure 4.6 for
each fixed value of SL. The preferred values of $\Delta t_1$ obtained here as a function of
SL are shown in Figure 4.7 normalized to the calculated value $[\Delta t_1]_p = 60$ ms.
Obviously, when SL = 80 dB, the most preferred value turns out to be the same as
Figure 4.6. Scale values as a function of $\Delta t_1$, for fixed values of the sensation level (SL).
(A): SL = 80 dB; (>): SL = 55 dB; and (•): SL = 30 dB (Ando and Okada, unpublished).


shown in Section 4.3.2. It is found that the most preferred value of $\Delta t_1$ increases
with decreasing value of SL. Since the SL is considered to be based on the internal
biological noise, a similar tendency may result due to the signal-to-noise ratio in the
physical sound field as well. It is quite natural that, if the noise level is increased,
$\Delta t_1$ and $T_{\mathrm{sub}}$ take on larger values, making these temporal factors effective.


4.4. Theory of Calculating Scale Values 
of Subjective Preference 

4.4.1. Theory of Subjective Preference 

Let us now put these results into practice. Since the number of orthogonal acoustic
factors which are included in the sound signals at both ears is limited, as men-
tioned in Section 3.4, the scale value of any one-dimensional subjective response
Figure 4.7. Normalized values of the preferred delay time as a function of SL, to that at
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may be expressed by 


S = g(x i, * 2 , • • • , xj). (4.8) 

In this study, the linear scale value of preference obtained by the law of
comparative judgment is described. It has been verified by a series of experiments
that the four objective factors act independently on the scale value when two of
the four factors are changed simultaneously, as indicated in Table 4.2. Results
indicate that the units of scale values are almost constant, even if different
music motifs are used (Ando, 1985), so that we may add scale values to obtain the
total scale value (Ando, 1983),

$$S = g(x_1) + g(x_2) + g(x_3) + g(x_4), \qquad (4.9)$$

where $S_i = g(x_i)$, $i$ = 1, 2, 3, 4, is the scale value obtained relative to each
objective parameter. Equation (4.9) indicates a four-dimensional continuity.

The dependence of the scale values on each objective parameter is shown graphically
in Figure 4.8. From the nature of the scale value, it is convenient to put a zero
value at the most preferred conditions, as shown in Figure 4.8. The results of the
scale value of subjective preference obtained from the different test series,
using different music programs, yield the following common formula:

$$S_i \approx -\alpha_i |x_i|^{3/2}, \quad i = 1, 2, 3, 4, \qquad (4.10)$$

where the values of $\alpha_i$ are weighting coefficients as listed in Table 4.3.
If $\alpha_i$ is close to zero, then a lesser contribution of the factor $x_i$ on
subjective preference is signified.

Figure 4.8a, b. Scale values of the subjective preference obtained for the simulated
sound field in an anechoic chamber. Different symbols indicate scale values obtained
by different source signals (Ando, 1985). Even if different signals are used, a
consistency of scale values as a function of each factor is observed, fitting a
single curve. (a) As a function of the listening level, LL [dB]. The most preferred
listening level, $[LL]_p$ = 0 dB; and (b) as a function of
$\Delta t_1 / [\Delta t_1]_p$.

Figure 4.8c, d. Scale values of the subjective preference obtained for the simulated
sound field in the anechoic chamber. Different symbols indicate scale values
obtained by different source signals (Ando, 1985). Even if different signals are
used, a certain consistency of scale values as a function of each factor is
observed, fitting a single curve. (c) As a function of $T_{sub} / [T_{sub}]_p$; and
(d) as a function of the IACC. The most preferred values $[\Delta t_1]_p$ and
$[T_{sub}]_p$ are calculated by Equations (4.5), with $k$ = 0.1 and $c$ = 1, and
(4.7), respectively.

Table 4.3. Objective parameters and coefficients $\alpha_i$ obtained with 9-16
subjects.

  i    $x_i$                                  $\alpha_i$ ($x_i > 0$)    $\alpha_i$ ($x_i < 0$)
  1    $20 \log P - 20 \log [P]_p$ (dB)       0.07                      0.04
  2    $\log(\Delta t_1 / [\Delta t_1]_p)$    1.42                      1.11
  3    $\log(T_{sub} / [T_{sub}]_p)$          0.45 + 0.75A              2.36 - 0.42A
  4    IACC                                   1.45                      -

The factor $x_1$ is given by the sound pressure level difference, measured by the
A-weighted network, so that

$$x_1 = 20 \log P - 20 \log [P]_p, \qquad (4.11)$$

$P$ and $[P]_p$ being the sound pressure at a specific seat, and the most preferred
sound pressure that may be assumed at a particular seat position in the room under
investigation;

$$x_2 = \log \frac{\Delta t_1}{[\Delta t_1]_p}, \qquad (4.12)$$

$$x_3 = \log \frac{T_{sub}}{[T_{sub}]_p}, \qquad (4.13)$$

$$x_4 = \mathrm{IACC}. \qquad (4.14)$$

Thus, the scale values of preference have been formulated approximately in terms
of the 3/2 power of the normalized objective parameters, expressed in the logarithm
for the parameters $x_1$, $x_2$, and $x_3$. The spatial binaural parameter $x_4$ is
expressed in terms of the 3/2 power of its real values, indicating a greater
contribution than those of the temporal parameters. Thus, the scale values are not
greatly changed in the neighborhood of the most preferred conditions, but decrease
rapidly outside of these ranges. Since the experiments were conducted to find the
optimal conditions, this theory holds only in the ranges of the preferred
conditions tested for the four factors.
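
The calculation defined by Equations (4.9) to (4.14) and Table 4.3 can be sketched
in a few lines of Python. This is only an illustrative sketch: the function and
variable names are hypothetical, the logarithms are assumed to be base 10, and the
quantity A in the coefficients for $x_3$ is assumed to be the total amplitude of
reflections defined earlier in the book.

```python
import math

# Weighting coefficients alpha_i from Table 4.3, stored as (x_i > 0, x_i < 0).
ALPHA_1 = (0.07, 0.04)   # listening level
ALPHA_2 = (1.42, 1.11)   # initial time-delay gap
ALPHA_4 = 1.45           # IACC (x_4 is never negative)


def alpha_3(total_amplitude):
    """Coefficients for the reverberation factor; they depend on A (assumed meaning)."""
    return (0.45 + 0.75 * total_amplitude, 2.36 - 0.42 * total_amplitude)


def partial_scale_value(x, alpha_pos, alpha_neg):
    """S_i = -alpha_i * |x_i|**(3/2), Equation (4.10)."""
    alpha = alpha_pos if x > 0 else alpha_neg
    return -alpha * abs(x) ** 1.5


def total_scale_value(level_db, preferred_level_db, dt1, preferred_dt1,
                      t_sub, preferred_t_sub, iacc, total_amplitude):
    """Total scale value S = S_1 + S_2 + S_3 + S_4, Equation (4.9)."""
    x1 = level_db - preferred_level_db           # Eq. (4.11), both levels in dB
    x2 = math.log10(dt1 / preferred_dt1)         # Eq. (4.12), base 10 assumed
    x3 = math.log10(t_sub / preferred_t_sub)     # Eq. (4.13), base 10 assumed
    x4 = iacc                                    # Eq. (4.14)
    return (partial_scale_value(x1, *ALPHA_1)
            + partial_scale_value(x2, *ALPHA_2)
            + partial_scale_value(x3, *alpha_3(total_amplitude))
            + partial_scale_value(x4, ALPHA_4, ALPHA_4))
```

With all four factors at their most preferred values and IACC = 0, the function
returns zero, and any departure from those conditions gives a negative value, in
line with the convention of placing the zero at the most preferred conditions.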


4.4.2. Calculation of the Subjective Preference 
at Each Seat 

As a typical example, we will discuss the quality of the sound field at each seat
in a concert hall with a shape similar to that of the Symphony Hall in Boston.
Suppose that a single source is located at the center, 1.2 m above the stage floor.
Receiving points at a height of 1.1 m above the floor level correspond to the ear
positions. Reflections with their amplitudes, delay times, and directions of
arrival at the listeners are taken into account using the image method.

Figure 4.9. An example of calculating scale values with the four factors using
Equations (4.9)-(4.14). (a) Contour lines of the total scale value for Boston
Symphony Hall with original side reflectors on the stage; and (b) contour lines of
the total scale values for the side reflectors optimized.
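
As a rough illustration of the image method mentioned above, the sketch below
mirrors a point source in a single side wall and derives the delay and relative
amplitude of that reflection with respect to the direct sound. Everything beyond
the source and receiver heights is an assumption made for the example: the wall
position, the speed of sound, a perfectly reflecting wall, and simple spherical
spreading. The function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def mirror_across_wall(source, wall_y):
    """Image of the source mirrored in a vertical wall lying in the plane y = wall_y."""
    x, y, z = source
    return (x, 2.0 * wall_y - y, z)


def first_reflection(source, receiver, wall_y, c=SPEED_OF_SOUND):
    """Delay (s) and amplitude of one wall reflection relative to the direct sound."""
    image = mirror_across_wall(source, wall_y)
    d_direct = math.dist(source, receiver)
    d_reflected = math.dist(image, receiver)     # path length via the wall
    delay_s = (d_reflected - d_direct) / c
    relative_amplitude = d_direct / d_reflected  # 1/r spreading, lossless wall assumed
    return delay_s, relative_amplitude


# Source 1.2 m above the stage, receiver 1.1 m high and 20 m away on the center
# line; the side wall at y = 10 m is a made-up position.
source = (0.0, 0.0, 1.2)
receiver = (20.0, 0.0, 1.1)
delay, amplitude = first_reflection(source, receiver, wall_y=10.0)
```

A full calculation would repeat this for every reflecting surface and for
higher-order images, and would also keep the direction of arrival of each
reflection, which is needed for the IACC.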

Contour lines of the total scale value of preference calculated for Music Motif
B are shown in Figure 4.9. This figure demonstrates the effects of the reflections
from the side reflectors on the stage. The side wall on the stage may produce
decreasing values of IACC for the audience area. Thus, the preference value at
each seat is increased, as is shown in Figure 4.9(b) in comparison with that in
Figure 4.9(a).

In this calculation, the reverberation time is assumed to be 1.8 s throughout
the hall and the most preferred listening level, $[LL]_p = 20 \log [P]_p$ in
Equation (4.11), is set for a point on the center line 20 m from the source
position.


4.5. Examination of Subjective Preference
in an Existing Hall

4.5.1. Preference Test in an Existing Hall 

The subjective preference judgments for different source locations on the stage
were performed by paired-comparison tests at each set of seats. The relationship
between the resulting scale value of subjective preference and the physical
factors obtained by simulation, using architectural plan drawings, was examined
by factor analysis (Sato, Mori, and Ando, 1997). Calculated scale values of
subjective preference were reconfirmed in an existing concert hall, the Uhara Hall
in Kobe (Figure 4.10). The physical factors at each set of seats for the four source
locations on the stage (Figure 4.10(a)) were calculated. In the simulation, the
directional characteristics of the four loudspeakers used in the preference tests
were taken into consideration. The simulation calculation was performed up to three
reflection times for each directional reflection $n$ in Equation (3.13) to a
listener. Due to a floor structure with a fair amount of acoustic transparency, the
floor reflection was not taken into account in the calculation, and part of the
diffuser ceiling was regarded as a nonreflective plane for the sake of convenience.
In the calculation of the IACC, the listeners faced toward the center of the stage,
so that the IACC was not always a maximum at the interaural time delay $\tau$ = 0.

This hall contains 650 seats with a volume of 4870 m$^3$. Loudspeakers were
placed at 0.8 m above the stage floor, and 64 listeners divided into 21 groups
were seated at the specified set of seats. Without moving from seat to seat and 
excluding the effects of other physical factors such as visual and tactual senses 
on judgments, subjective preference tests by the paired-comparison method were 
conducted, switching only the loudspeakers on the stage. As a source signal, Music 
Motif B was selected in the tests. Scale values of preference were obtained by 
applying the law of comparative judgment (Case V; Thurstone, 1927; Torgerson, 
1958) and were reconfirmed by the goodness of fit (Mosteller, 1951). In order to 
obtain enough data, a set of adjoining three or four seats was chosen in a single 
test session. This session was repeated five times, exchanging the sets of seats, and 
14 to 16 subjects in total were tested at each set of seats. 
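
The scaling step named above, Thurstone's law of comparative judgment (Case V), can
be sketched as follows. This is a minimal illustration rather than the procedure
used in the study: the function name, the clipping of extreme proportions, and the
example matrix are assumptions, and Mosteller's goodness-of-fit check is not
included.

```python
from statistics import NormalDist


def case_v_scale_values(preference, eps=0.01):
    """Thurstone Case V scale values from a paired-comparison matrix.

    preference[i][j] is the proportion of judgments in which stimulus i was
    preferred to stimulus j. Proportions are clipped to [eps, 1 - eps] (an
    assumption of this sketch) so that the inverse normal stays finite.
    """
    n = len(preference)
    std_normal = NormalDist()
    values = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # a stimulus compared with itself contributes z = 0
            p = min(max(preference[i][j], eps), 1.0 - eps)
            z_sum += std_normal.inv_cdf(p)  # proportion -> unit normal deviate
        values.append(z_sum / n)
    return values


# Made-up proportions for three stimuli, for illustration only.
example = [
    [0.5, 0.8, 0.9],
    [0.2, 0.5, 0.7],
    [0.1, 0.3, 0.5],
]
scale = case_v_scale_values(example)
```

The resulting values form an interval scale whose origin is arbitrary; here they
sum to zero, and only the differences between stimuli carry meaning.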


4.5.2. Results of Multiple-Dimensional Analysis 

In order to examine the relationship between the scale values of subjective prefer- 
ence and physical factors obtained by simulation of an architectural scheme, the 
data were analyzed by the factor analysis described in Appendix I (Hayashi, 1952; 
1954a and b). 

Figure 4.10. Plan (a) and cross-sections (b) and (c) of the Uhara Hall, Kobe. Four
source positions on the stage were switched in the paired-comparison tests of
preference without moving subjects from seat to seat. Listening positions were 21
locations and included neighboring seats. The seats are illustrated in only part of
Figure 4.10(a).

Of the four physical factors, the reverberation time was almost constant for the
source location and the seat location throughout the hall, and was thus not
involved in the analysis. As was previously discussed, as a condition of
calculating the scale value of preference, the maximum value of the interaural
cross-correlation function must be maintained at an interaural time delay
$\tau$ = 0 to ensure frontal localization of the sound source. However, the IACC
was not always maintained at $\tau$ = 0 due to the loudspeaker locations, because
the subjects could not always be facing the source location. In this analysis,
therefore, the effect of the interaural time delay of the IACC, i.e.,
$\tau_{IACC}$, was added as an additional factor. Thus, the outside variable to be
predicted with factor analysis was the scale value obtained by subjective
judgments, and the explanatory factors were:

(1) the listening level; 

(2) the initial time-delay gap; 

(3) the IACC; and 

(4) the interaural time delay of the IACC ($\tau_{IACC}$).

The scores for each category of factors obtained from the factor analysis are
shown in Figure 4.11. As shown in Figure 4.11(a), the scores of the listening level
indicate a peak at the subcategory of 83 dB to 85.9 dB, and the score decreases
away from the preferred listening level. For the IACC, the preference score
increases with a decrease in the IACC (Figure 4.11(c)). It is worth noticing that
the scores of the above-mentioned two factors are in good agreement with the
preference scale values obtained by preference judgments for a simulated sound
field. The scores of the initial time-delay gap, which are normalized to the
optimum value ($\Delta t_1 / [\Delta t_1]_p$), peaked at a smaller value than the
most preferred value of the initial time-delay gap obtained from the simulated
sound fields (Figure 4.11(b)). It is considered that, due to the limited range of
$\Delta t_1$ in the existing concert hall and the limited data in the short range
of $\Delta t_1$, the effects of the $\Delta t_1$ of the sound fields were rather
minor in this investigation. Concerning $\tau_{IACC}$ (Figure 4.11(d)), the score
decreases monotonically as the delay is increased. This may be caused by an image
shift.

The relationship between the scale value obtained by subjective judgments and
the total score at each center of three or four seats is shown in Figure 4.12. The
scale values of preference are well predicted with the total score for four
loudspeaker locations ($r$ = 0.70, $p$ < 0.01). On occasion there is a certain
degree of coherence between physical factors, for example, the calculated listening
level and the IACC for sound fields in existing concert halls. However, due to the
fact that the factors are theoretically orthogonal, the preference scores obtained
here are in good agreement with the calculated preference scales that are obtained
by simulating sound fields. It is possible that such an apparent coherence may be
eliminated without loss of any information, even if some data are excluded for the
analyses.

In this study, the subjective preference of source locations on the stage is
examined at each set of seats. The rear source (#4) is preferred over the other
sources. The side source (#3) indicates low preference, due to the interaural time
delay of the IACC. The initial time-delay gap has a small influence on the total
score.

This study introduces the effects of the interaural time delay of the IACC on the
preference. If the IACC is obtained at a nonzero interaural time delay, then the
scores decrease rapidly. The results of the analyses described here demonstrate
that the theory of calculating subjective preference by the use of four physical
parameters is supported only when the maximum value of the interaural
cross-correlation is maintained at $\tau$ = 0. This condition is usually satisfied
in a real concert, where the listener faces the visible performer.

Figure 4.11a, b. Scores for each category of four physical factors obtained by
factor analysis. The number signifies the partial correlation coefficient between
the score and each factor. (a) Listening level; (b) normalized initial time-delay
gap between the direct sound and the first reflection.

Figure 4.11c, d. Scores for each category of four physical factors obtained by
factor analysis. The number signifies the partial correlation coefficient between
the score and each factor. (c) IACC; and (d) interaural time delay of the IACC,
$\tau_{IACC}$, found as the most significant factor in this investigation with
loudspeaker reproduction on the stage. Tendencies obtained here are similar to
those of the scale values shown in Figure 4.8, Section 4.4.

Figure 4.12. Relationship between the scale values obtained by paired-comparison
tests in the existing hall and the total scores calculated by factor analysis using
the scores shown in Figure 4.11. The correlation coefficient, $r$ = 0.70
($p$ < 0.001).


Human Hearing System


The first part of this chapter describes the sensitivity of the human ear to a sound 
source that is formed primarily by the physical system consisting of the external 
canal, eardrum, and bone chain with oval window. 

Next, in order to describe a background of subjective responses, the electrical- 
physiological responses of the auditory pathways and of the left- and right-cerebral 
hemispheres are analyzed. Several remarkable findings are offered in this chapter. 


5.1. Physical Systems of Human Ears 

5.1.1. Head, Pinna, and External Auditory Canal 

The acoustic environment is perceived by the ears, in which a sound signal is
given by a time sequence. The three-dimensional space is also perceived by the
ears, mainly because the head-related transfer functions
$H_{l,r}(r \mid r_0, \omega)$ between a source point and the two ear entrances have
directional qualities from the shapes of the head and the pinna system. The
directional information is contained in such head-related transfer functions,
including the interaural time difference.

Figure 5.1 shows examples of the amplitude of the head-related transfer function
$H(\xi, \eta, \omega)$ as a parameter of the angle of incidence $\xi$
($\eta$ = 0). These were measured by the single-pulse method at the far-field
condition (Mehrgardt and Mellert, 1977). The angle $\xi$ = 0° corresponds to the
frontal direction and $\xi$ = 90° corresponds to the lateral direction toward the
side of the ear being examined.

Since the diameter of the external canal is small enough compared with the
wavelength (below 8 kHz), the transfer function $E(\omega)$ is independent of the
directions in which sound is incident on the human head for the audio-frequency
range:

$$E_{l,r}(\xi, \eta, \omega) \approx E_{l,r}(\omega) \approx E(\omega).$$

Therefore, interaction between the sound field in the external canal and that of
the outside, including the pinna, is insignificant. The transfer function from the
free field to the eardrum can be obtained by multiplying together the following two
functions:

(1) the transfer function from the sound source in the free field to the ear-canal
entrances, $H_{l,r}(\xi, \eta, \omega)$; and

(2) the transfer function from the entrance to the eardrum, $E(\omega)$.

Figure 5.1. Transfer functions (amplitude) from a free field to the ear-canal
entrance as a parameter of the horizontal angle $\xi$, plotted against frequency
[kHz] (Mehrgardt and Mellert, 1977).

Measured absolute values of $E(\omega)$ are shown in Figure 5.2, where the
variations in the curves obtained by different investigations are caused mainly by
the different definitions of the ear-canal entrance point. A typical example of the
transfer function from a sound source in front of the listener to the eardrum is
shown in Figure 5.3. This corresponds to direct sound when the listener is facing
the performer. The transfer functions obtained by these three reports (Wiener and
Ross, 1946; Shaw, 1975; Mehrgardt and Mellert, 1977) are not significantly
different for frequencies up to 10 kHz.


5.1.2. Eardrum and Bone Chain 

Behind the eardrum are the tympanic cavities containing the three auditory ossicles, 
the malleus, incus, and stapes. This area is called the middle ear (Figure 5.4). 



Figure 5.2. Transfer functions of the ear canal as a function of frequency [kHz],
from Wiener and Ross (1946), Shaw (1974), and Mehrgardt and Mellert (1977).

Figure 5.3. Transfer functions from a sound source in front of the listener to the
eardrum as a function of frequency [kHz], from Wiener and Ross (1946), Shaw (1975),
and Mehrgardt and Mellert (1977).

Figure 5.4. Schematic illustration of the human ear (modified from Dorland, 1947).


The sound pressure striking the eardrum is transduced into vibration. The middle
ear ossicles transmit the vibration to the cochlea. The vibration pattern of the
human eardrum was first measured by Bekesy (1960) by making a point-by-point
examination with an electric capacitive probe. Later, Tonndorf and Khanna (1972)
measured the vibration pattern by time-averaged holography, which allows finer
vibration patterns on the eardrum to be observed, as shown in Figure 5.5. Note
that the outline of the malleus is visible in the pattern at the value of 3.5. The
vibration on the malleus is transmitted to the incus and the stapes. The transfer
function $C(\omega)$ of the human middle ear between the sound pressure at the
eardrum and the apparent sound pressure on the cochlea is plotted in Figure 5.6.
The values have been rearranged by the author. Data were obtained by Onchi (1961)
and Rubinstein et al. (1966) from cadavers. The maxima at 1 kHz are adjusted to the
same value. Later, Puria, Rosowski, and Peake (1993) made measurements with a
system that included a hydropressure transducer placed in the vestibule, as shown
in Figure 5.7. The hydropressure transducer and the microphone, given identical
sound pressure stimuli in air, produced estimates of pressure within 0.5 dB of each
other for the range of 50 Hz to 11 kHz. The results at the sound pressure levels of
106 dB, 112 dB, and 118 dB, which indicate similar values, are shown in Figure 5.8.
These results agree well with the data shown in Figure 5.6, as far as relative
behavior is concerned. The transfer function measured at 124 dB showed some signs
of nonlinearity, but below about 118 dB it was consistent with a linear system. The
magnitude of the middle-ear pressure gain is about 20 dB in the frequency range
500 Hz to 2 kHz.

Figure 5.5. Contour lines of equal amplitude of human eardrum vibration at 525 Hz
(121 dB SPL). Each value should be multiplied by $10^{-5}$ cm (Tonndorf and Khanna,
1972).

For the usual sound field, the transfer function between a sound source located
in front of the listener and the cochlea may be represented by

$$S(\omega) = H(0, 0, \omega)\,E(\omega)\,C(\omega). \qquad (5.1)$$

The values are plotted in Figure 5.9 with data from Onchi (1961) and Rubinstein
et al. (1966). The pattern of the transfer function agrees with the ear sensitivity
for people with normal hearing ability, so that the ear sensitivity can be
characterized primarily by the transfer function from the free field to the cochlea
(Zwislocki, 1976). Better agreement can be obtained with the values reexamined in
the low-frequency range (Berger, 1981).

Figure 5.6. Transfer function (relative amplitude) of the human middle ear between
the sound pressure at the eardrum and the apparent pressure on the cochlea. (•):
Average value measured (modified from Onchi, 1961); (o): Measured value (Rubinstein
et al., 1966).

Figure 5.7. Measurement system of middle-ear transfer function (Puria, Rosowski, and
Peake, 1993). To measure the inner-ear pressure, a hydropressure transducer was
placed in the vestibule facing the stapes. In order to ensure that the cochlea
remains fluid filled during the measurement, an inlet flush tube was cemented into
the upper semicircular canal and an outlet flush tube was cemented into the apical
turn of the cochlea.
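
Because Equation (5.1) is a product of transfer functions, the gains of the
individual stages simply add when expressed in decibels. The short sketch below
illustrates this at a single frequency. The magnitudes are placeholders chosen for
the example, not measured data; only the value for the middle ear is tied to the
text, being set to match the gain of about 20 dB quoted above. The function names
are hypothetical.

```python
import math


def to_db(magnitude):
    """Convert a pressure-ratio magnitude to decibels."""
    return 20.0 * math.log10(magnitude)


def cascade(h, e, c):
    """S = H * E * C at one frequency, Equation (5.1); complex values are allowed."""
    return h * e * c


# Placeholder magnitudes at one frequency (illustrative only, not measured values).
h = 1.2    # free field to ear-canal entrance
e = 2.0    # ear-canal entrance to eardrum
c = 10.0   # middle ear; a ratio of 10 corresponds to 20 dB

s = cascade(h, e, c)
total_db = to_db(abs(s))
sum_of_stage_db = to_db(abs(h)) + to_db(abs(e)) + to_db(abs(c))
```

Here `total_db` and `sum_of_stage_db` agree to within rounding, which is why curves
such as those in Figures 5.3 and 5.6 can be combined by adding their values in dB.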


5.1.3. The Cochlea 

The stapes is the last bone of the three auditory ossicles, and is the smallest bone
of the human body. It is connected with the oval window, and drives the fluid in
the cochlea, producing a traveling wave along the basilar membrane. The cochlea
contains the sensory receptor organ on the basilar membrane, which transforms
the fluid vibration into the neural code (Figure 5.10). The basilar membrane is
so flexible that each section can move independently of the neighboring section.
The traveling waves on the basilar membrane observed by Bekesy (1960), Figure
5.11(a, b), are consistent with this representation.

Figure 5.8. Transfer function of the human middle ear between the sound pressure at
the eardrum and the inner-ear pressure, as a function of frequency [kHz] (Puria,
Rosowski, and Peake, 1993). The global behavior is surprisingly similar to that of
Figure 5.6.

Figure 5.9. Sensitivity of the human ear to a sound source in front of the
listeners. Curves: normal hearing threshold (ISO recommendation), and the threshold
reexamined in the low-frequency range (Berger, 1981). Symbols (•, o):
transformation characteristics between the sound source and the cochlea,
$S(\omega) = H(\omega)E(\omega)C(\omega)$; (•): data obtained from measured values
of $C(\omega)$ by Onchi (1961); and (o): from Rubinstein (1966), which are combined
with the transfer function $H(\omega)E(\omega)$ as measured by Mehrgardt and Mellert
(1977).

Figure 5.10. Cross-section through the cochlea showing the fluid-filled canals and
the basilar membrane supporting hair cells (modified from Rasmussen, 1943).


5.1.4. The Nervous System 

The nervous system is one of the most important parts in the whole acoustic system.
Obviously, there are some deep connections between sound signals, responses of
the auditory system and brain, and subjective attributes, which will be covered in
Sections 5.2 and 5.3.

The mechanical information in the traveling waves on the basilar membrane
is transduced into biological information. The transducers, consisting of about
15,000 receptors on the basilar membrane, are specialized nerve cells called hair
cells. The action potentials from the hair cells are conducted and transmitted to a
higher level in the brain. The frequency response curves of single fibers, called
"tuning curves," were first systematically demonstrated in the auditory pathway
by Katsuki and his group (1958). The results of the threshold response in the
potential activity of the cochlear nerve of a cat are shown in Figure 5.12(a), and
of the trapezoid body in Figure 5.12(b). The important phenomenon is the so-called
sharpening effect. The tuning curve becomes sharper than the resonance curve on
the basilar membrane. This tendency becomes more distinct at higher levels and the
slope reaches the order of 1000 dB/octave. Bekesy (1967) explained this as a result
of a lateral inhibition action of neural networks. Interactions between neighboring
neurons are responsible, at least partially, for the sharpening. Therefore,
responses to a single pure tone $\omega$ tend to approach a limited region $x'$ in
the auditory pathway. Accordingly, the input power density spectrum of the cochlea
$I(\omega)$ can be roughly mapped at the nerve position $x'$, so that the spectrum
can be written as $I(x')$. This neural activity appears capable of attaining the
autocorrelation function as described in Section 3.1.

Figure 5.11. (a) Envelope of the traveling wave; and (b) the traveling waves on the
basilar membrane at 200 Hz, plotted against the distance from the stapes [mm]
(Bekesy, 1960).

In addition to the cochlear nuclei, there are the superior olivary complex, the
lateral lemniscus nuclei, the inferior colliculus, and the medial geniculate body.
Neural signals are processed at every relay station. Since several interaural cross
connections are known to exist as physiological structures (e.g., Pickles, 1982), it
is quite possible that there exists an interaural cross-correlation mechanism at the
inferior colliculus as discussed below. Also, in the following section, the results
of some experiments with records of electro-physiological responses from the
auditory pathways and the left and right cerebral hemispheres will be described.

Figure 5.12. Frequency response functions of single fibers as threshold responses
in the potential activity of a cat's auditory system. Each line indicates the
response of a different single fiber (Katsuki, Sumi, Uchiyama, and Watanabe, 1958).
(a) Cochlear nerve; and (b) trapezoid body.


5.2. Influence of Electro-Physiological Responses 
from Auditory Pathways and Human Cerebral 
Hemispheres Relating to Subjective Preference 

5.2.1. Auditory Brainstem Responses (ABR) 

A possible mechanism has been assumed for the IACC in the auditory pathways
in judging subjective preference and subjective diffuseness. The left and right
auditory brainstem responses (ABR) were recorded in order to justify such a
mechanism for the spatial information that might exist in the auditory pathways
(Ando, Yamamoto, Nagamatsu, and Kang, 1991).

(A) ABR Recording and Flow of Neural Signals 

As a source signal $p(t)$, a short-pulse signal (50 µs) was supplied to a
loudspeaker with frequency characteristics of ±3 dB for 100 Hz to 10 kHz. This
signal was repeated every 100 ms for 200 s (2000 times), and left and right ABRs
were recorded through electrodes placed on the vertex, and left and right mastoids.
The distance $|r - r_0|$ between the loudspeakers and the center of the head was
kept at 68 cm ± 1 cm. The loudspeakers were located on the right-hand side of the
subject.
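
The numbers in this setup can be checked with a little arithmetic, sketched below.
The speed of sound of 340 m/s is an assumption made for the example, and the
function names are hypothetical.

```python
SPEED_OF_SOUND = 340.0  # m/s, assumed


def pulse_count(duration_s, period_s):
    """Number of pulses presented when one pulse is sent every period_s seconds."""
    return round(duration_s / period_s)


def propagation_delay_ms(distance_m, c=SPEED_OF_SOUND):
    """Time for sound to travel distance_m, in milliseconds."""
    return 1000.0 * distance_m / c


repetitions = pulse_count(200.0, 0.100)   # 200 s at one pulse per 100 ms
delay_ms = propagation_delay_ms(0.68)     # loudspeaker to the center of the head
```

The repetition count reproduces the 2000 presentations stated above, and the
travel time over 68 cm comes out at about 2 ms under the assumed speed of sound.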

Examples of recorded ABRs for one of the four subjects, as a parameter of the
horizontal angle of sound incidence, are shown in Figure 5.13. It is seen that
waves I-VI from the vertex and the right mastoid differ in amplitude as indicated
by each curve. Quite similar ABR data were obtained for the four subjects who
participated (23 ± 2 years of age, male), and their data were averaged. As shown
in Figure 5.14(a) (wave I), of particular interest is the fact that amplitudes from
the right, which may correspond to the sound pressure from the source located at
the right-hand side, are greater than those from the left: $r > l$ for
$\xi$ = 30° to 150° ($p$ < 0.01). This tendency is reversed in wave II as shown in
Figure 5.14(b): $l > r$ for $\xi$ = 60° and 90° ($p$ < 0.05). The behavior of wave
III shown in Figure 5.14(c) is similar to that in wave I: $r > l$ for
$\xi$ = 30° to 150° ($p$ < 0.01). This tendency is again reversed in wave IV as
shown in Figure 5.14(d): $l > r$ for $\xi$ = 60° and 90° ($p$ < 0.05), and this is
maintained further in wave VI as shown in Figure 5.14(f), even though absolute
values are amplified: $l > r$ for $\xi$ = 60° and 90° ($p$ < 0.05). From this
evidence, it is likely that the flow of neural signals is interchanged three times
between the cochlear nucleus, the superior olivary complex, and the lateral
lemniscus, as shown in Figure 5.15 for this spatial information process. The
interchanges at the inferior colliculus may be operative for the interaural signal
processing as discussed below.

Figure 5.13. Examples of the auditory brainstem response (ABR) obtained between the
vertex and left and right mastoids, as a parameter of the horizontal angle of sound
incidence. The abscissa is the time relative to the time when the single pulse
arrives at the right ear entrance. Arrows indicate the time delay, depending upon
the sound source location on the right-hand side of the subject, and the null
amplitude of the ABR.


In wave V, as shown in Figure 5.14(e), such a reversal cannot be seen, and the
relative behavior of the amplitudes of the left and the right are parallel and
similar. Thus, these two amplitudes were averaged and plotted in Figure 5.18
(symbols V). In this figure, the amplitudes of wave IV (left and right, symbols
$l$ and $r$) are also plotted, in reference to the ABR amplitudes at frontal sound
incidence.

Concerning latencies of waves I through VI relative to the time when the short
pulse was supplied to the loudspeaker, the behaviors indicating relatively short
latencies in the range around $\xi$ = 90° were similar (Figure 5.16). It is
remarkable that a significant difference is achieved ($p$ < 0.01) between averaged
latencies at $\xi$ = 90° and those at $\xi$ = 0° (or $\xi$ = 180°), i.e., a
difference of about 640 µs on average, which corresponds to the interaural time
difference of sound incident at $\xi$ = 90°. It is most likely that the relative
latency at wave III may be reflected by the interaural time difference. No
significant differences could be seen between the latencies of the left and right
of waves I-IV, as indicated in Figure 5.16.

Figure 5.14a, b. Averaged amplitudes of the ABRs for four subjects, waves I-VI. The
size of the circles indicates the number of available data from four subjects. (•):
Left ABRs; (o): Right ABRs. (a) Wave I and (b) Wave II.

Figure 5.14c, d. Averaged amplitudes of the ABRs for four subjects, waves I-VI. The
size of the circles indicates the number of available data from four subjects. (•):
Left ABRs; (o): Right ABRs. (c) Wave III and (d) Wave IV.

Figure 5.14e, f. Averaged amplitudes of the ABRs for four subjects, waves I-VI. The
size of the circles indicates the number of available data from four subjects. (•):
Left ABRs; (o): Right ABRs. (e) Wave V and (f) Wave VI.

Figure 5.15. Schematic illustration of the flow of signals in auditory pathways,
with the corresponding ABR waves I-VI. EC: external canal; ED and BC: eardrum and
bone chain; BM and HC: basilar membrane and hair cell; CN: cochlear nucleus; SOC:
superior olivary complex; LLN: lateral lemniscus nucleus; IC: inferior colliculus;
MGB: medial geniculate body; AC: auditory cortex of the right and left hemispheres.
The source location of each wave (ABR) was previously investigated for both animal
and human subjects (Jewett, 1970; Lev and Sohmer, 1972; Buchwald and Huang, 1975).

(B) ABR Amplitudes in Relation to the IACC 

Figure 5.17 shows values of the magnitude of interaural cross-correlation and the
autocorrelation functions at the time origin. These were measured at the two ear
entrances of a dummy head as a function of the horizontal angle after passing
through the A-weighting networks. The averaged amplitudes of wave IV (left
and right) and averaged amplitudes of wave V, both normalized to the
amplitudes at the frontal incidence (ξ = 0°), are shown in Figure 5.18. Even with a
lack of data at ξ = 0°, similar results could be obtained when the amplitudes were
normalized to those at ξ = 180°. Although we cannot make a direct comparison
between the results in Figures 5.17 and 5.18, it is interesting to point out that the
relative behavior of wave IV_l in Figure 5.18 is similar to Φ_rr(0) in Figure 5.17,
which was measured at the right-ear entrance r. Also, the relative behavior of
wave IV_r is similar to Φ_ll(0) at the left-ear entrance l. In fact, the amplitudes of
wave IV (left and right) are proportional to Φ_xx(0) (x = r, l, respectively), due
Figure 5.16. Averaged latencies of the ABRs for four subjects, waves I-VI. The size of the
circles indicates the number of available data from four subjects. The latency of 2 ms
marked in the figure corresponds to the distance between the loudspeaker and the center
of the head. (•): left ABRs; (○): right ABRs.


to the interchange of signal flow. The behavior of wave V is similar to that of the
maximum value, |Φ_lr(τ)|max, |τ| < 1 ms. Since correlations have the dimensions
of the power of the sound signals, i.e., the order of A^2, the IACC defined by
Equation (3.23) may correspond to

    P = \frac{A_V^2}{A_{IV,r} \, A_{IV,l}}    (5.2)

where A_V is the amplitude of wave V, which may be reflected by the “maximum”
neural activity (A_V ≈ |Φ_lr(τ)|max) at the inferior colliculus (see Figure
5.15). Also, A_IV,r and A_IV,l are the amplitudes of wave IV on the right and left,
respectively. The results obtained by Equation (5.2) are plotted in Figure 5.19. It is
clear that the relative behaviors of the IACC and P are in good agreement, except
for the value of P at ξ = 150°, at which a single datum was obtained
Figure 5.17. Correlations of sound signals at the left- and right-ear entrances of a dummy
head. L: Φ_ll(0) measured at the left ear; R: Φ_rr(0) measured at the right ear; and Φ:
maximum interaural cross-correlation, |Φ_lr(τ)|max, |τ| < 1 ms.



Figure 5.18. Averaged amplitudes of waves IV_l (symbol: l) and IV_r (symbol: r), and
averaged amplitudes of waves V_l and V_r (symbol: V) normalized to the amplitudes at the
frontal incidence (four subjects).
Figure 5.19. Values of the IACC and P calculated by Equations (3.23) and (5.2),
respectively. A linear relationship between the IACC and the value P is obtained
(p < 0.005). Note that the available datum at ξ = 150° was from a single subject.


with only a single subject. The values exceeding unity are caused by the error in 
the measurements. Obviously, a high correlation between the values of the IACC 
and P is achieved, i.e., 0.92 (p < 0.005). 
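
As a rough illustration of how Equation (5.2) and the reported correlation might be evaluated, the following Python sketch computes P from wave amplitudes and a Pearson correlation coefficient between two series such as P and the IACC. The function names are hypothetical, and no measured data are included; only the form P = A_V^2 / (A_IV,r A_IV,l) is taken from the text.

```python
import numpy as np


def abr_p_index(a_v, a_iv_r, a_iv_l):
    """Evaluate P = A_V^2 / (A_IV,r * A_IV,l) for each incidence angle.

    Inputs are sequences of (normalized) ABR amplitudes, one value per angle.
    """
    a_v = np.asarray(a_v, dtype=float)
    a_iv_r = np.asarray(a_iv_r, dtype=float)
    a_iv_l = np.asarray(a_iv_l, dtype=float)
    return a_v ** 2 / (a_iv_r * a_iv_l)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))
```

With amplitudes normalized to those at frontal incidence, `abr_p_index` returns unity at ξ = 0° by construction, and `pearson_r(iacc_values, p_values)` would give the kind of coefficient quoted above.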

(C) Remarks 

The amplitudes of the ABR clearly differ according to the horizontal angle of the
incidence of sound to the listener, as shown in Figure 5.14. In particular, it is found
that the amplitudes of waves IV_l and IV_r are nearly proportional to the sound
pressures at the right- and left-ear entrances, respectively, when the amplitude is
normalized to that at ξ = 0° or 180°.

As far as the left- and right-amplitude behaviors of the ABR recorded here 
are concerned, the first interchange of the neural signal is considered to occur 
at the entrances of the cochlear nucleus, the second interchange may take place 
at the superior olivary complex, and the third may be at the lateral lemniscus 
nucleus, as shown in Figure 5.15. Thompson and Thompson (1988), who used 
neuroanatomical tract-tracing methods in guinea pigs, found four separate pathways
connecting one cochlea either with the other cochlea or with itself, all via
brainstem neurons. This may relate to the first interchange at the entrance of the
cochlear nucleus.

As has been discussed, the “maximum” of the neural activity for wave V (inferior
colliculus) in the auditory pathways corresponds to the IACC, appearing in
the amplitude of the ABR around 8.5 ms after the sound signal is supplied to the
loudspeakers (at a distance of 68 cm ± 1 cm).

The latency of wave V decreases with increasing sensation level, as shown in
Figure 5.20 (Hecox and Galambos, 1974). This implies binaural summation for the
sound energy or the sound pressure level, which may be reflected in both Φ_ll(0) and
Φ_rr(0) corresponding to Equation (3.24). As will be discussed below, the sound
level response indicates right hemisphere dominance. Also, the relative latency at
wave III corresponds to the interaural time difference (Figure 5.16).

As described below, it is remarkable that there is a linear relationship between
the IACC and the N_2-latency observed in the slow vertex response (SVR) over
both cerebral hemispheres (p < 0.025), as shown in Figure 5.21 (see also Figure
5.26, right). Thus, the subjective preference and subjective diffuseness judgments
of the sound field, described previously in relation to the IACC, are well based on
the activities of the auditory-brain system.


Figure 5.20. Latency [ms] of wave V as a function of the sensation level (Hecox and
Galambos, 1974). This may correspond to a binaural summation of sound energies from the
left and right ears. (See also the P_1 and N_1 latencies of the SVR as shown in Figure 5.26.)
Figure 5.21. The linear relationship between the IACC and the N_2-latency, p < 0.01
(Ando, Kang, and Nagamatsu, 1987). (•): N_2-latency of SVR over the left hemisphere; (○):
N_2-latency of SVR over the right hemisphere; and (line): regression.


5.2.2. Slow Vertex Responses (SVRs) 

Previously, four significant and independent physical factors have been discussed 
consisting of time and space criteria of the sound field in a concert hall. Efforts to 
describe the important qualities of sound in terms of the processes of the auditory 
pathways and the brain have been brought to bear on the problem. If enough were 
known about how the auditory and the central nervous systems modify the nerve 
impulses from the cochlea, the design of concert halls, for example, could proceed 
according to guidelines derived from the knowledge of these processes. Attempts 
to approach this have been made through a study of the auditory evoked potentials 
over the left and right human cerebral hemispheres. 

(A) Recording Slow Vertex Responses as 
Auditory Evoked Potential (AEP) 

Prior to the test, each subject was asked to abstain from smoking and from drinking 
any kind of alcoholic beverage for about 12 hours. In order to compare the results 
of the SVRs with the subjective preference obtained by paired-comparison tests, 
a reference stimulus was first presented and then the adjustable test stimuli were 
presented. Such pairs of stimuli were presented alternately 50 times and the SVRs 
were recorded. The electrical responses were obtained from the left and right 
temporal areas (T3 and T4) according to the International 10-20 System (Jasper,
1958). The reference electrodes were located on the right and left earlobes and
were connected together. Figure 5.22 shows examples of the SVR amplitude (as
an AEP amplitude), obtained by averaging the 50 responses for a single subject,
as a function of the delay time of a single reflection. The amplitude of the reflection
was the same as that of the direct sound, A_0 = A_1 = 1, and the source signal was
a 0.9 s fragment of continuous speech (Japanese), “ZOKI-BAYASHI” (meaning a
grove or a copse). The reference sound field was only the direct sound,
without any time delay, and the total sound pressure levels were kept constant 
in this experiment. Two loudspeakers producing the direct sound and the single 
reflection were located together in front of the subject, so that the magnitude of 
the interaural cross-correlation (IACC) could be kept at a constant value of nearly 
unity for all sound fields tested here. 

From Figure 5.22, the latency is found to be longest at a reflection delay of 25 ms,
the most preferred delay time, indicating a relaxation. This delay time corresponds
to the effective duration of the autocorrelation function (ACF) of the continuous
speech signal (Ando, Kang, and Morita, 1987).

Figure 5.22. Averaged SVRs recorded for a single subject. Dotted lines are the loci of
the P_2 latency for the delay time of the reflection. The upward direction indicates
negativity. (a) Left hemisphere; and (b) right hemisphere.
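
The test sound field used here, a direct sound plus a single reflection of equal amplitude (A_0 = A_1 = 1) at a given delay, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, the delay is rounded to whole samples, and the level matching is a simple RMS equalization rather than the calibration actually used in the experiment.

```python
import numpy as np


def add_single_reflection(signal, fs, delay_ms, a0=1.0, a1=1.0):
    """Return a0 * direct sound + a1 * one reflection delayed by delay_ms.

    The output is longer than the input by the delay (in samples).
    """
    signal = np.asarray(signal, dtype=float)
    delay = int(round(delay_ms * 1e-3 * fs))
    out = np.zeros(len(signal) + delay)
    out[:len(signal)] += a0 * signal   # direct sound
    out[delay:] += a1 * signal         # single reflection
    return out


def match_rms(test, reference):
    """Scale `test` so that its RMS equals that of `reference`.

    Assumption: RMS equality stands in for "total sound pressure level kept
    constant"; the original calibration procedure is not described here.
    """
    test = np.asarray(test, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rms_test = np.sqrt(np.mean(test ** 2))
    rms_ref = np.sqrt(np.mean(reference ** 2))
    return test * (rms_ref / rms_test)
```

For example, `match_rms(add_single_reflection(x, fs, 25.0), x)` would give a test field with a 25 ms reflection at the same RMS level as the reference (direct sound only).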

(B) Hemispheric Dominance Depending on 
the Spatial and Temporal Factors 

Figure 5.23 shows amplitudes of the early SVR, A(P_1 - N_1), as a function of the
delay time of reflection. The values, averaged for eight normal subjects, are plotted
in this figure. The source signal was continuous speech. The solid line indicates
the amplitude from the left hemisphere and the dashed line the amplitude from
the right hemisphere. Obviously, the amplitude from the left is greater than that
from the right (p < 0.01). This may indicate left hemisphere dominance or
specialization of the human brain for such a change of the delay time of reflection
for speech (see also Table 5.1). When the sensation level (SL) was changed, the
amplitude of the SVR from the right hemisphere was greater than that from the left
hemisphere, except at an SL of 30 dB, even though the speech signal was used,
as shown in Figure 5.24 (Nagamatsu, Kasai, and Ando, 1989). When the IACC
was changed in the paired stimuli, using 1/3-octave-band noise with a center
frequency of 500 Hz, the amplitude from the right hemisphere was much
greater than that from the left, as shown in Figure 5.25 (Ando, Kang, and Nagamatsu,
1987). Putting all these data together, Table 5.1 shows that the hemispheric
dominance changes with the sound signal and with the acoustic factor of the
sound field that is varied. It is remarkable that hemispheric dominance appeared
Figure 5.23. Averaged amplitudes A(P_1 - N_1) of the test sound field over the left and
right hemispheres, as a function of the delay time of the reflection (eight subjects).
(solid line): left hemisphere; and (dashed line): right hemisphere.
Table 5.1. Amplitude differences of the SVR, A(P_1 - N_1), over the left
and right cerebral hemispheres. See also Table 5.4.

Source signal          Parameter varied   A(P_1 - N_1)   Significance level
Speech (0.9 s)         SL                 R > L          < 0.01
Speech (0.9 s)         Δt_1               L > R          < 0.01
Speech vowel /a/       IACC               R > L          < 0.025
1/3-oct. band noise    IACC               R > L          < 0.05


only in the amplitude component of the SVR. It is found here that the right
hemisphere was dominant for “the continuous speech” signal under the condition of
varying the SL, while the left hemisphere was dominant under the condition of
varying the delay time of reflection, which is a temporal criterion of the sound
field. If the IACC was changed, then the right hemisphere was highly activated
due to the spatial criterion and noise stimulus.

As is well known, the left hemisphere is mainly associated with speech and time- 
sequential identifications, and the right is concerned with nonverbal and spatial 
identifications (Kimura, 1973; Sperry, 1974). It is considered here, however, that 
hemispheric dominance is a relative phenomenon depending on what is changed in 



Figure 5.24. Averaged amplitudes, A(P_1 - N_1), of the test sound field over the left and
right hemispheres, as a function of the sensation level (five subjects). ( ): left
hemisphere; and ( ): right hemisphere.
Figure 5.25. Averaged amplitudes, A(P_1 - N_1), of the test sound field over the left and
right hemispheres, as a function of the IACC (eight subjects). ( ): left hemisphere;
( ): right hemisphere.


the comparison pair, i.e., the temporal criterion or the spatial criteria, and no
absolute behavior could be observed.

(C) Relationship Between the N_2-Latency and Subjective Preference

Figure 5.26 summarizes the relationship between the scale values of subjective 
preference (subjective diffuseness in the change of the IACC) and the latency 
components of the SVR, while the behavior of the amplitude component indicated 
hemispheric dominance as discussed above. Applying the paired method of stimuli, 
both the SVR and the subjective preference for sound fields were investigated 
as functions of the sensation level and the time delay of the single reflection. 
As is mentioned above, the source signal was continuous speech with a 0.9 s 
duration. 

The results of the scale value of subjective preference are indicated in the upper
part of Figure 5.26, while the lower part indicates the appearance of latency
components. As shown in the left and center columns in this figure, the neural
information related to subjective preference appeared typically in an N_2-latency
of 250 ms to 300 ms, when the SL and the delay time of the reflection Δt_1 were
changed.

Further details of the latencies for both the test sound field and the reference
sound field, when Δt_1 was changed, are shown in Figure 5.27. Interestingly, the
parallel latencies at P_2, N_2, and P_3 were clearly observed as functions of the delay








Figure 5.26. Relationships between averaged latencies of the SVR and subjective preference
for three objective parameters. Vertical axes: scale value (SV) of preference (upper
panels) and latency [ms] (lower panels). ( ): left hemisphere; ( ): right hemisphere.
(a) As a function of the sensation level (SL); (b) as a function of the delay time of
reflection, Δt_1; and (c) as a function of the IACC.
Figure 5.27. Averaged latencies for both the test sound field and the reference sound field
for paired stimuli, as a function of the delay time of the reflection, Δt_1. ( ): left
hemisphere; ( ): right hemisphere. Maximum latencies of P_2, N_2, and P_3 are
found at Δt_1 = 25 ms, while relatively short latencies of P'_2, N'_2, and P'_3 are observed.


time Δt_1. However, latencies for the reference sound field (Δt_1 = 0) in the paired
stimuli are found to be relatively shorter, while the latencies for the test sound field
(Δt_1 = 25 ms, the most preferred delay) become longest. This may indicate a kind
of contrast process underestimating the reference sound field when it is compared
with a preferred sound field.

In general, relatively long-latency responses are observed in the subjectively
preferred range of each factor. Thus, the difference of N_2-latencies over both
hemispheres in response to a pair of sound fields contains almost the same information
as that obtained from paired-comparison tests for preference, as does the primitive
subjective response.
The right column of Figure 5.26 shows the effects of varying the IACC using
1/3-octave-band noise (500 Hz). In the upper part, the scale value of the subjective
diffuseness is indicated as a function of the IACC. The scale value of the subjective
preference behaves similarly when plotted against the IACC, when speech or
music signals are presented as described in the previous section. The information
related to subjective diffuseness, therefore, appears in the N_2-latency, ranging from
260 ms to 310 ms, in which a tendency for the latency to increase with decreasing
IACC was observed for eight subjects (except for the left hemisphere of one
subject). As already indicated in Figure 5.21, the relationship between the IACC
and the N_2-latency was found to be linear, and the correlation coefficient between
them was -0.99 (p < 0.01).

Furthermore, let us look at the behavior of the early latencies of P_1 and N_1.
These were almost constant when the delay time and the IACC were changed.
However, the information related to the sensation level and loudness may be found
typically at the N_1-latency. This tendency agrees well with the result of Botte,
Bujas, and Chocholle (1975).

Consequently, from 40 ms to 170 ms of the SVR, hemispheric dominance
may be found in the amplitude component, which may be called specialization
of the left and right hemispheres. The latency differences corresponding to
the sensation level may then be found in the range of 120 ms to 170 ms. Finally, it
is found that the N_2-latency components in the delay range between 200 ms and
310 ms may well correspond to the subjective preference relative to the listening
level, the time delay of the reflection, and, indirectly, the IACC. Since the longest
latency was always observed at the most preferred condition, it is concluded that
the larger part of the brain may be relaxed at the preferred condition, causing the
observed latency behavior to occur.

As discussed in Section 5.2.1, the activity of the ABR in the short delay range 
(less than 10 ms) after the sound signal has arrived at the eardrums, may indicate 
a possible mechanism in the auditory pathways for detecting the magnitude of the 
IACC. 
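
For readers who wish to experiment with the quantity discussed throughout this section, the following sketch estimates the IACC from two ear signals as the maximum of |Φ_lr(τ)| over |τ| ≤ 1 ms, normalized by the square root of Φ_ll(0) Φ_rr(0). Equation (3.23) is not reproduced in this section, so this usual normalization is an assumption here; the function name is hypothetical, and the A-weighting applied in the measurements is omitted.

```python
import numpy as np


def iacc(left, right, fs, max_lag_ms=1.0):
    """Estimate the IACC of two equally long ear signals sampled at fs [Hz].

    Returns max |Phi_lr(tau)| for |tau| <= max_lag_ms, divided by
    sqrt(Phi_ll(0) * Phi_rr(0)). No A-weighting is applied (assumption).
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    max_lag = int(round(max_lag_ms * 1e-3 * fs))

    phi_ll0 = np.dot(left, left)     # autocorrelation at the time origin
    phi_rr0 = np.dot(right, right)

    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.dot(left[:n - lag], right[lag:])
        else:
            c = np.dot(left[-lag:], right[:n + lag])
        best = max(best, abs(c))
    return float(best / np.sqrt(phi_ll0 * phi_rr0))
```

Identical signals at the two ears give a value of unity, as for the frontal loudspeaker arrangement described above, while independent noise signals give a value near zero.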


5.2.3. SVR in Change to Both the Initial Time Delay of 
Reflections and the IACC 

The scale values of sound fields estimated in a series of subjective preference 
judgments, conducted for the purpose of designing concert halls and auditoria, 
have been well described by four independent factors, including both temporal 
and spatial ones. The auditory-evoked potentials over the left and right human
cerebral hemispheres were analyzed in an effort to model auditory-brain systems
responding to sound fields, and a possible mechanism of the interaural
cross-correlation function at the inferior colliculus in the auditory pathways was found
by recording the left and right auditory brainstem responses. Analyses of the slow
vertex responses (SVR) revealed that the N_2-latency in the delay range between
200 ms and 310 ms corresponds well to the subjective preference with reference
to the listening levels, the initial time delay of the reflection, and the IACC. So
far, results of the SVR in the change of each single factor of sound have been
described.

Now, we have to examine whether or not the initial time delay of reflections
as a temporal factor and the IACC as a spatial factor influence the N_2-latency
independently (Ando and Mizuno, data). The experimental subjects were
two right-handed Japanese students, subjects W and U. The SVR was recorded
from the left and right temporal areas, labeled T3 and T4, respectively.
A pair of sound fields consisting of the reference stimulus (Δt_1 = 0 ms and
IACC = 0.99: the direct sound was reproduced by a loudspeaker in front of
the subject) and the test stimulus was presented. The sound-pressure level
measured at the ear entrance was held constant at a peak value of 75 dB(A). The
sound signal used was a male voice of 0.9 s duration reading the words of the
Japanese poem, “ZOUKI-BAYASHI.” The silent interval between the sound
signals was 1.1 s. The effective duration τ_e of the autocorrelation function (ACF) of
the speech signal was 35 ms for the direct sound reproduced by the loudspeaker
used here.

To evaluate the independence of the two factors by their influence on the
N_2-latency, the initial time-delay gap between the direct sound and the first reflection
of the test stimuli was adjusted to be

Δt_1 = 25, 35, and 70 [ms],

and the IACC of the sound field was controlled by interchanging the two directional
reflections (Ando and Mizuno, data), so that

IACC = 0.27, 0.49, and 0.59.

The amplitudes of the two reflections were both -3 dB with respect to that of the
direct sound, and the delay time between the two reflections was 5 ms, making
the reflections incoherent. Thus, a total of nine sound fields was used as test stimuli,
and the test was repeated five times for each subject.
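
The 3 × 3 design described above can be written out explicitly. The short sketch below simply enumerates the nine test sound fields and the five repetitions per subject; the names are hypothetical, and the presentation order used in the experiment is not specified in the text, so no ordering or randomization is implied.

```python
from itertools import product

DELAYS_MS = (25, 35, 70)          # initial time-delay gap, Δt_1 [ms]
IACC_VALUES = (0.27, 0.49, 0.59)  # IACC of the test sound field
REPETITIONS = 5                   # repetitions per subject


def build_test_fields():
    """Return the nine (Δt_1, IACC) combinations used as test stimuli."""
    return [{"delay_ms": d, "iacc": c} for d, c in product(DELAYS_MS, IACC_VALUES)]


def build_trials():
    """Return every test field repeated REPETITIONS times (order unspecified)."""
    return [field for field in build_test_fields() for _ in range(REPETITIONS)]
```

Each subject thus contributes 45 test presentations, each paired with the reference stimulus (Δt_1 = 0 ms, IACC = 0.99).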

An example of the averaged N_2-latencies of the SVR from the left and right
hemispheres of subject W is shown in Figure 5.28 as a function of Δt_1, for three
IACC values. Similar to the results of the previous sections, significant changes
due to the two factors are obtained only in the N_2-latencies. Here, the N_2-latencies
for subject W depended on both Δt_1 (p < 0.01) and the IACC (p < 0.05) for both
hemispheres, and there is no significant interaction between these factors on the
N_2-latency. This study confirms the independent influence of both factors, such that
the longest N_2-latencies are obtained when Δt_1 = τ_e (35 ms when the total amplitude
of reflections A = 1.0), which corresponds to the most preferred condition, and the
latencies are longer when the IACC is smaller. No significant differences were
found between the left and right hemispheres in this experiment, possibly because
the temporal and spatial criteria, which are associated with different hemispheres,
were both changed.
Figure 5.28. Latencies of the SVR from the left hemisphere as a function of the initial time
delay Δt_1, with the IACC of the sound fields as a parameter, for a single subject (Ando and
Mizuno, data).


The N_2-latency values for subject U also indicated the maximum at Δt_1 = 35 ms
(p < 0.01), but there were no significant differences between IACC = 0.27 and
0.59. There are probably two causes for this result:

(1) the small range of the IACC varied, between 0.27 and 0.59; and 

(2) the individual differences between subjects, similar to the difference of 
subjective preference judgments as discussed in Chapter 9. 


5.3. Influence of the Continuous Brain Wave (CBW) on 
Subjective Preference 

Thus far, we have discussed results obtained by averaging the auditory evoked
potentials (SVR) up to 500 ms in the change of the SL, Δt_1, and the IACC, using short
signals of up to 0.9 s. However, for a wide range of subsequent reverberation times
(T_sub), no useful data could be obtained by the SVR.

The purpose of this section is to find a distinctive feature in the continuous
brain wave (CBW) for the T_sub with a long signal duration. Before going into
detail, a preliminary study was performed in the change of the delay time of a
single reflection, to reconfirm the SVR results discussed in Section 5.2 (Ando
and Chen, 1996).


5.3.1. CBW in the Change of the Delay Time 
of a Single Reflection 

In this experiment, Music Motif B (Arnold: Sinfonietta, Opus 48, a 5 s piece
of the third movement) was selected as the sound source. The delay time of the
single reflection Δt_1 was alternately adjusted to 35 ms (a preferred condition)
and 245 ms (a condition of echo disturbance). The CBW of ten pairs from T3 and
T4 was recorded for about 140 s in one day, and the experiments were repeated
over a total of three days with 11 subjects (male students, 24 ± 2 years of age).
In an anechoic chamber, the subject was asked to close his eyes and to concentrate
on listening to the music during recording of the CBW. The IACC was kept at a
constant value of nearly unity. Two loudspeakers were set up in front of the
subject. The sound-pressure level was fixed at 70 dBA, with the amplitude of the
single reflection the same as that of the direct sound, A_0 = A_1 = 1. The
leading edge of each sound signal was recorded at the same time for analyses of
the CBW. The recorded CBW was sampled at 100 Hz or more after passing through
a filter of 5-40 Hz with a slope of 140 dB/oct.

In order to analyze a possible brain activity corresponding to the subjective
preference, an attempt was made to analyze the effective duration of the ACF, τ_e, in
the α-wave range (8 Hz to 13 Hz) of the CBW. First of all, considering the fact that
the subjective preference judgment needs at least 2 s to develop a “psychological
present,” the running integration interval (2T) was examined by changing it between
1.0 s and 4.0 s. A satisfactory duration 2T in the ACF analyses was found only
from the left hemisphere, for 2-3 s, but not from the right.
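
The analysis outlined above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: the α-band is isolated with a simple FFT mask, the running ACF is computed over windows of length 2T, and τ_e is taken as the delay at which a straight line fitted to the initial decay (0 to -5 dB) of the ACF envelope, in 10 log10 |φ(τ)|, reaches -10 dB. That definition of τ_e, the window step, the maximum lag, and all function names are assumptions introduced for the example.

```python
import numpy as np


def bandpass_fft(x, fs, lo=8.0, hi=13.0):
    """Keep only the lo..hi Hz band (alpha range) using an FFT mask."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))


def normalized_acf(frame, max_lag):
    """Normalized ACF of one frame for lags 0..max_lag (in samples)."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    full = np.correlate(frame, frame, mode="full")
    zero = len(frame) - 1                      # index of lag 0
    acf = full[zero:zero + max_lag + 1]
    return acf / acf[0]


def effective_duration(acf, fs, fit_floor_db=-5.0, target_db=-10.0):
    """Estimate tau_e [s]: delay where the fitted ACF envelope hits target_db.

    Returns NaN when fewer than two envelope peaks lie above fit_floor_db
    or when the fitted envelope does not decay.
    """
    level = 10.0 * np.log10(np.maximum(np.abs(acf), 1e-12))
    peaks = [0] + [k for k in range(1, len(level) - 1)
                   if level[k] >= level[k - 1] and level[k] > level[k + 1]]
    peaks = np.array(peaks)
    keep = peaks[level[peaks] >= fit_floor_db]
    if len(keep) < 2:
        return float("nan")
    slope, intercept = np.polyfit(keep / fs, level[keep], 1)
    if slope >= 0.0:
        return float("nan")
    return float((target_db - intercept) / slope)


def running_tau_e(x, fs, window_s=2.5, step_s=0.5, max_lag_s=1.0):
    """tau_e of the alpha-band running ACF, one value per window of 2T."""
    alpha = bandpass_fft(x, fs)
    n_win = int(round(window_s * fs))
    n_step = int(round(step_s * fs))
    max_lag = int(round(max_lag_s * fs))
    values = []
    for start in range(0, len(alpha) - n_win + 1, n_step):
        acf = normalized_acf(alpha[start:start + n_win], max_lag)
        values.append(effective_duration(acf, fs))
    return np.array(values)
```

Here `window_s` plays the role of the integration interval 2T, so `running_tau_e(cbw, fs, window_s=2.5)` corresponds to the 2T = 2.5 s condition; the step between successive windows is a free choice in this sketch.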

Table 5.2 indicates the results of the analysis of variance for values of τ_e in the
α-wave obtained at 2T = 2.5 s. Though the individual difference is great (p < 0.01),
a significant difference is obtained for Δt_1 (p < 0.025); however, a significant
interference effect is also observed between the factors Δt_1 and LR
(p < 0.01). Therefore, in order to analyze the data in more detail for each Δt_1
and LR, we show the averaged value of τ_e in the α-wave for the 11 subjects in Figure
5.29. It is clearly shown that the values of τ_e at Δt_1 = 35 ms are significantly
longer than those at Δt_1 = 245 ms (p < 0.01) only in the left hemisphere, but
not in the right. Ratios of the τ_e values in the α-wave range at Δt_1 = 35 ms to those
at 245 ms are shown for each subject in Figure 5.30. All of the individual data indicate
that the ratios in the left hemisphere are much larger than those in the right
hemisphere at the preferred condition of 35 ms.

Thus, the results reconfirm that, when Δt_1 is changed, the left hemisphere is
greatly activated (Figure 5.23), and the value of τ_e in the α-wave in this hemisphere
corresponds well with the subjective preference. It is widely accepted that the
α-wave, which has the longest period in the CBW in the waking state, may indicate
a fullness of “pleasantness” and “comfortableness,” or a preferred condition.
Figure 5.29. Averaged value of τ_e in the α-brain-wave range for a change of Δt_1: 35 ms
and 245 ms (eleven subjects). Left: left hemisphere; and Right: right hemisphere.


Thus, a large value of τ_e in the α-wave may relate to the large N_2-latency of the
SVR at the preferred condition discussed in Section 5.2.


5.3.2. CBW in the Change of the Subsequent 
Reverberation Time 

Now, let us examine values of τ_e in the α-wave for a change of the subsequent
reverberation time (T_sub) with 10 subjects relative to the scale values of subjective
preference (Chen and Ando, 1996). The sound source used was Music Motif B,
the same as above. The CBWs from the left and right hemispheres were recorded.
Values of τ_e in the α-wave for the duration 2T = 2.5 s were also analyzed here.


Table 5.2. Results of the analysis of variance
for values of τ_e of the ACF in the α-wave, in
change of Δt_1.

Factors                  F       Significance level
Subject                  93.1    < 0.01
Hemisphere, LR           1.0
Delay time, Δt_1         5.8     < 0.025
Subject and LR           8.9     < 0.01
Subject and Δt_1         0.4
LR and Δt_1              9.6     < 0.01
Subject, LR and Δt_1     0.4

Figure 5.30. Ratio of the τ_e values in the α-brain-wave range for a change of Δt_1,
[τ_e value at 35 ms]/[τ_e value at 245 ms], for each of eleven subjects, A-K. Left: left
hemisphere; Right: right hemisphere.

Figure 5.31. Averaged value of τ_e in the α-brain-wave range [ms] for a change of
T_sub [s]: 0.2 s and 1.2 s; 1.2 s and 6.4 s (ten subjects). Left: left hemisphere;
Right: right hemisphere.
First consider the averaged values of τ_e in the α-wave as shown in Figure
5.31. Clearly, the values of τ_e are much larger at T_sub = 1.2 s (a preferred
condition) than those at T_sub = 0.2 s in the left hemisphere, and the values of τ_e
are also larger at T_sub = 1.2 s than those at T_sub = 6.4 s in the
left hemisphere. However, these facts are not true for the right hemisphere; rather,
the contrary is true.

The results of the analysis of variance are indicated in Table 5.3. Although there
is a large individual difference, a significant difference is achieved for T_sub in the
pair 0.2 s and 1.2 s (p < 0.05), and interference effects are observed for the factors
Subject and LR (p < 0.01), and LR and T_sub (p < 0.01). No such significant
difference is achieved for the pair 1.2 s and 6.4 s, but there are interference
effects between Subject and LR, and between Subject and T_sub. Thus, in order to
discuss the matter in more detail, the ratios of the values of τ_e in the α-wave are
shown in Figure 5.32 for each subject. All of the individual data indicate that the
ratios in the left hemisphere are much larger than those in the right hemisphere at
T_sub = 1.2 s relative to T_sub = 0.2 s (Figure 5.32(a)). However, this is not the case
for T_sub = 1.2 s relative to T_sub = 6.4 s, indicating large individual differences
(Figure 5.32(b)).

In fact, these individual results correspond well to the scale values of individual
subjective preference. Figure 5.33 shows the scale values of preference as a
function of T_sub for each subject. The most preferred values of T_sub, which were
different for each subject, averaged about 1.2 s. The ratio of the values of τ_e in
the α-wave at 1.2 s and 6.4 s is well correlated with the difference of the scale
values of the subjective preference of each individual, reflecting a large individual
difference, as shown in Figure 5.34 (r = 0.70, p < 0.01).

The CBW in change of the IACC has also been investigated (Nishio and Ando,
1996). Clearly, in change of the IACC using Music Motif B, right hemisphere
dominance is found by analyses of the value of τ_e in the α-wave (p < 0.001).
This activity then propagates toward the left hemisphere, as derived from
calculation of the cross-correlation function between the CBWs recorded from
the different electrodes.

Table 5.4 summarizes the hemisphere dominance obtained by analyses of the
values of τ_e in the α-wave on a change of Δt_1, T_sub, and the IACC. This conclusion
may suggest that the value of τ_e in the α-wave is an objective index for obtaining
excellent conditions of the human environment so far as the brain is concerned
(Section 12.1). Note that no preference information appeared in the amplitude Φ(0) of
the running ACF of α-waves.


5.4. Auditory-Brain System: A Proposed Model
5.4.1. Background 

This model is based on the following facts: First of all, we are interested in the
fact that the human ear sensitivity to the sound source in front of the listener is
essentially formed by the physical system from the source point to the oval window
of the cochlea, as discussed in Section 5.1.

Figure 5.32. Ratio of τ_e values in the α-brain-wave range for a change of T_sub for each of
ten subjects, (a)-(j). (a) [τ_e value at 1.2 s]/[τ_e value at 0.2 s]; and (b) [τ_e value at
1.2 s]/[τ_e value at 6.4 s]. Left: left hemisphere; Right: right hemisphere.

By recording the left and right ABRs it has been found that: 

(1) Amplitudes of waves I and III correspond roughly to the sound-pressure level
as a function of the horizontal angle of incidence to the listener (ξ).


Table 5.3. Results of the analysis of variance for values of $\tau_e$ of the
ACF of the α-wave for changes of $T_{sub}$.

                             Pair of 0.2 s & 1.2 s        Pair of 1.2 s & 6.4 s
Factor                       F      Significance level    F      Significance level
Subject                      40.9   < 0.01                40.2   < 0.01
LR                           2.1                          2.0
$T_{sub}$                    6.2    < 0.025               0.02
Subject and LR               2.8    < 0.01                2.0    < 0.05
Subject and $T_{sub}$        1.2                          2.7    < 0.01
LR and $T_{sub}$             14.0   < 0.01                0.2
Subject, LR and $T_{sub}$
Figure 5.33. Scale values of the subjective preference as a function of $T_{sub}$ obtained by the
paired-comparison tests for each of ten subjects, (a)-(j). Scale values at 6.4 s are extrapolated
by the curve of the 3/2 power of $\log_{10}(T_{sub})$, according to the results shown in Figure 4.8.

Figure 5.34. Relationship between the difference of scale values [SV (1.2 s) to SV (6.4 s)]
and the ratio of $\tau_e$ values in the α-brain-wave range of the left hemisphere for a change in
$T_{sub}$ for each of ten subjects, (a)-(j).


(2) Amplitudes of waves II and IV correspond roughly to the sound-pressure level
as a function of the contra horizontal angle ($-\xi$), implying an interchange of
neural information flow between the left and right hemispheres.

(3) Results of analyses of the ABRs indicate that possible neural activities at the
inferior colliculus correspond well to the values of the IACC.

Also, it has been discovered by recording the left and right SVRs that:

(4) The left and right amplitudes of the early SVRs, $A(P_1 - N_1)$, indicate that
the left and right hemispheric dominances are due to the temporal factor, $\Delta t_1$,
and the spatial factors, LL and IACC, respectively, as indicated in Table 5.1.


Table 5.4. Hemisphere dominance obtained by analyses of the values of $\tau_e$ in
the α-wave for changes of $\Delta t_1$, $T_{sub}$, and the IACC. See also Table 5.1.

Source signal    Parameter varied   Ratio of values of $\tau_e$ for α-waves   Significance level
Music Motif B    $\Delta t_1$       L > R                                     p < 0.01
Music Motif B    $T_{sub}$          L > R                                     p < 0.01
Music Motif B    IACC               R > L                                     p < 0.01
(5) Both the left and right SVR latencies correspond well to the scale values of
subjective preference as a primitive response.

In addition to the above-mentioned facts:

(6) Results of the CBW for the cerebral hemispheric specialization of the temporal
factors, $\Delta t_1$ and $T_{sub}$, indicate left-hemisphere dominance, and the IACC
indicates right-hemisphere dominance. Thus, a high degree of independence
between the left and right hemispheric factors can be achieved.

(7) Values of $\tau_e$ in the α-waves from dominant hemispheres correspond to the
scale value of subjective preference.

(8) The results of subjective preference, coloration, and other important subjective 
attributes are well described in relation to both the autocorrelation function of 
source signals and the interaural cross-correlation function. 

It is worth noticing that the SL or LL is classified as a temporal-monaural factor
from the physical viewpoint. However, the results of the SVR indicate that the SL
is associated with right-hemisphere dominance (Section 5.2.2). Thus, hereafter, the
SL or LL is classified as a spatial factor, which is also expressed by the geometric
average value of the sound energies arriving at the two ears, given by Equation (3.24).

Based on these physiological responses, a model of the auditory-brain system
may be proposed for the major independent acoustic factors, classified by
comprehensive temporal and spatial factors, which are well represented in the model.
The model consists of the autocorrelation mechanisms, the interaural cross-correlation
mechanism between the two auditory pathways, and the specialization of human
cerebral hemispheres for temporal and spatial factors of the sound field.


5.4.2. The Model 

According to the relationship between subjective attributes and the
auditory-evoked potentials, including the CBW under changes of acoustic factors, a
model can be proposed as shown in Figure 5.35. In this figure, a sound source $p(t)$
is located at $r_0$ in a three-dimensional space, and a listener sitting at $r$ is defined
by the location of the center of the head, $h_{l,r}(r|r_0, t)$ being the impulse responses
between $r_0$ and the left and right ear-canal entrances. The impulse responses of
the external ear canal and the bone chain are $e_{l,r}(t)$ and $c_{l,r}(t)$, respectively. The
velocities of the basilar membrane are expressed by $V_{l,r}(x, \omega)$, $x$ being the position
along the membrane.

The action potentials from the hair cells are conducted and transmitted to the 
cochlear nuclei, the superior olivary complex including the medial superior olive, 
the lateral superior olive and the trapezoid body, and to the higher level of two 
cerebral hemispheres as shown in Figure 5.35. 

Figure 5.35. An auditory-brain model for subjective responses.

According to the tuning of a single nerve fiber (Katsuki et al., 1958; Kiang,
1965), the input power density spectrum of the cochlea $I(x')$ can roughly be
mapped at a certain nerve position $x'$. This fact may be partially supported by the
ABR waves (I-IV), which reflect the sound-pressure levels as a function of the
horizontal angle of incidence to a listener (Section 5.2.1). Such neural activities,
in turn, include sufficient information to attain the ACF at a higher level, probably
near the lateral lemniscus, as indicated by $\Phi_{ll}(\sigma)$ and $\Phi_{rr}(\sigma)$. The interchange of
neural signals discussed in Section 5.2.1 is not included here for convenience. As is
also discussed in Section 5.2.1, the neural activity (wave V) may correspond to the
IACC. Thus, the interaural cross-correlation mechanism may exist at the inferior
colliculus. It is concluded that the output signal of the interaural cross-correlation
mechanism, including the IACC and the loci of maxima, may be dominantly
connected to the right hemisphere. Also, the sound-pressure level, which corresponds
to the denominator of Equation (3.23) with the ACFs for the two ears at the origin
of time ($\sigma = 0$) and which, in fact, appears in the latency at the inferior colliculus,
as shown in Figure 5.20, may be processed in the right hemisphere. Effects of
the initial time-delay gap between the direct sound and the single reflection, $\Delta t_1$,
included in the autocorrelation function, may activate the left hemisphere. The
specialization of the human cerebral hemispheres may relate to the highly independent
contributions of the spatial and temporal criteria to subjective attributes. It
is remarkable that, for example, the “cocktail party effect” might well be explained
by such specialization of the human brain, because speech is processed in the left
hemisphere, and spatial information is mainly processed in the right hemisphere.


5.4.3. Subjective Responses from the Model 

So-called “overall responses,” such as subjective preference, must be associated
with both hemispheres and with all four acoustic factors of the temporal and spatial
variables. As discussed in Section 6.4, speech intelligibility and clarity may be
influenced by all four factors, including the IACC (Nakajima and Ando, 1991;
Nakajima, 1992), associated with both human cerebral hemispheres. In general,
any subjective attribute is described by all the acoustic factors associated with
both cerebral hemispheres.

For example, the apparent source width (ASW) is considered to depend on the
IACC and the listening level (LL), as previously reported by Keet (1968) (see also
Section 6.1). However, the subjective diffuseness may to some extent be related to
$\Delta t_1$ and $T_{sub}$ under the fixed conditions of the IACC and LL. Also, echo disturbance,
the threshold of perception of a reflection, and coloration (temporal information)
are considered to be dominantly processed in the left hemisphere, in relation to the
envelope function of the ACF of the source signals and $\Delta t_1$ (Ando, 1986), but
they may depend more or less on the IACC as well. Furthermore, the IACC as
a subjective diffuseness factor is dependent on $\Delta t_1$ for a very short delay range,
$0 < \Delta t_1 < 0.1\tau_e$, increasing the value of the IACC (Ando, 1977).

Any other subjective responses may well be described in such a manner by at 
least four orthogonal factors, together with the ACF of the source signal and the 
total amplitude of reflections, similar to the analysis of subjective preference, as 
is discussed in the following chapter. 
Important Subjective Attributes for
the Sound Field, Based on the Model


Based on the model of the previous chapter, we can describe qualities of sound
fields in terms of processes of the auditory pathways and the brain. The power
density spectra in the neural activities in the left and right auditory pathways
have a sharpening effect (Katsuki et al., 1958; Kiang, 1965). This information is
enough to attain approximately the autocorrelation functions $\Phi_{ll}(\sigma)$ and $\Phi_{rr}(\sigma)$,
respectively, where $\sigma$ corresponds to the neural activities. Together with the
mechanism of the interaural cross-correlation function found in Section 5.2, fundamental
subjective attributes may then be well described.


6.1. Subjective Diffuseness and ASW in Relation to the
IACC and/or the $W_{IACC}$

The interaural cross-correlation function is a significant factor in determining the
perceived horizontal direction of a sound and the degree of subjective diffuseness
of a sound field (Damaske and Ando, 1972). A well-defined direction is perceived
when the normalized interaural cross-correlation function has one sharp maximum
(a small value of $W_{IACC}$ defined in Figure 3.7). On the other hand, subjective
diffuseness or no spatial impression corresponds to a low value of the IACC
(< 0.15). The subjective diffuseness or spatial impression of the sound field in a room
is one of the fundamental attributes in describing good acoustics. If the sounds
arriving at the two ears are dissimilar (IACC = 0), then different signals
(but containing the same information) are conveyed through the two channels of the
auditory system to the brain. This condition, in turn, improves speech clarity, as is
discussed in Section 6.4.
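
As a rough illustration of the quantity discussed above, the IACC may be computed
from two ear signals as the maximum magnitude of the normalized interaural
cross-correlation function within a delay range of about 1 ms. The following is only a
minimal sketch in plain Python: the function name `iacc`, the sampling rate, and the
test signals are assumptions introduced for illustration, and the normalization by the
geometric mean of the two ear energies at zero delay follows the description of
Equation (3.23) given in Section 5.4.2.

```python
import math

def iacc(left, right, fs, max_lag_ms=1.0):
    """Maximum magnitude of the normalized interaural cross-correlation
    within |tau| <= max_lag_ms (illustrative sketch, not the book's code)."""
    n = min(len(left), len(right))
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    # Energies at zero delay, i.e. the ACFs of the two ear signals at the origin.
    e_l = sum(x * x for x in left[:n])
    e_r = sum(x * x for x in right[:n])
    norm = math.sqrt(e_l * e_r)
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best

fs = 48000                                   # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 500 * i / fs) for i in range(2400)]
delayed = [0.0] * 12 + tone[:-12]            # same signal arriving 0.25 ms later
print(iacc(tone, tone, fs))                  # identical ear signals
print(iacc(tone, delayed, fs))               # interaural delay only
```

Identical signals at the two ears give a value of unity, and a pure interaural delay
within the search range leaves the value close to unity; only dissimilar ear signals
lower it.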

In order to obtain the scale value of subjective diffuseness, paired-comparison
tests with bandpass Gaussian noise, varying the horizontal angle of two symmetric
reflections, have been conducted (Ando and Kurihara, 1986; Singh, Kurihara, and
Ando, 1994). Listeners judged which of two sound fields was perceived as more
diffuse. A remarkable finding is that the scale values of subjective diffuseness
decrease as the IACC increases, and may be formulated in terms of the 3/2
power of the IACC in a manner similar to the subjective preference values (see
Section 4.3), i.e.,

$$S \approx -\alpha\,(\mathrm{IACC})^{\beta}, \tag{6.1}$$

where $\alpha = 2.9$ and $\beta = 3/2$.
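
The relation of Equation (6.1), S ≈ -α(IACC)^β with α = 2.9 and β = 3/2, can be
evaluated directly. The short sketch below is illustrative only; the function name
`diffuseness_scale_value` and the sample IACC values are assumptions, not part of
the original experiments.

```python
def diffuseness_scale_value(iacc, alpha=2.9, beta=1.5):
    """Scale value of subjective diffuseness, S = -alpha * IACC**beta (Eq. 6.1)."""
    if not 0.0 <= iacc <= 1.0:
        raise ValueError("IACC is expected to lie between 0 and 1")
    return -alpha * iacc ** beta

# Hypothetical IACC values spanning the possible range.
for value in (0.0, 0.25, 0.5, 1.0):
    print(value, round(diffuseness_scale_value(value), 3))
```

The scale value is largest (zero) for IACC = 0 and falls monotonically as the IACC
approaches unity.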

The results of scale values by the paired-comparison test, and the calculated
values by Equation (6.1) as a function of the IACC, are shown in Figure 6.1. There
is a great variation of data in the range of IACC < 0.5; however, no essential
difference may be found in the results with frequencies between 250 Hz and 4
kHz. The scale values of subjective diffuseness, which depend on the horizontal
angle, are shown in Figure 6.2, for 1/3 octave-bandpass noise with the center
frequencies of 250 Hz, 500 Hz, 1 kHz, 2 kHz, and 4 kHz. Obviously, the most
effective horizontal angles of reflections differ depending on the frequency
range, and are inversely related to the behavior of the IACC values. These are
about ±90° for the 500 Hz range and the frequency range below 500 Hz, around
±55° for the 1 kHz range, and smaller angles for the 2 kHz and 4 kHz ranges (Figure
6.3). The control of directional reflections, for each frequency range, by means of
wall-surface structures, is described in Chapter 8.



Figure 6.1. Scale values of subjective diffuseness as a function of the calculated IACC.
Different symbols indicate different frequencies of the 1/3 octave bandpass noise: (△):
250 Hz; (○): 500 Hz; (□): 1 kHz; (●): 2 kHz; and (■): 4 kHz. Line: regression line
by Equation (6.1).

Figure 6.3. Effective horizontal angles to a listener to bring about a maximum decrease in
the IACC and an increase in subjective diffuseness for each frequency band. (○): Calculated
(IACC); and (●): observed (IACC and subjective diffuseness, Figure 6.2).


For a sound field with a predominantly low-frequency range (below 250 Hz),
the interaural cross-correlation function has no sharp peaks for the delay range of
$|\tau| < 1$ ms, and $W_{IACC}$ becomes wider, as is shown in Figure 6.4(a, b). The values
shown in Figure 6.4(b) are calculated theoretically by Equation (3.29) with, for
simplicity, $\delta = 0.1$ and IACC = 1. It is worth noticing that the value of $\delta$ may be
expressed as a function of the IACC.

Of particular interest is that a wider ASW may be perceived within the
low-frequency bands and by decreasing the IACC, as indicated by the equal-ASW
curves (Hidaka, Beranek, and Okano, 1995). More clearly, the ASW may well
be described by both factors, $W_{IACC}$ and IACC (Sato and Ando, 1996, in which
$W_{IACC}$ is defined, for practical convenience, by the time width of the interaural
cross-correlation function crossing zero; see also Sato and Ando, 1997). Such
a perception becomes much more significant if listeners close their eyes or are
listening to loudspeaker reproduction. However, the ASW becomes a minor
effect in a real concert hall due to visual perception of the location of the sound
sources.


Figure 6.2. Scale values of subjective diffuseness and the IACC as a function of the
horizontal angle of incidence to a listener, with 1/3 octave band noise of center frequencies
(a) 250 Hz; (b) 500 Hz; (c) 1 kHz; (d) 2 kHz; and (e) 4 kHz.

Figure 6.4. (a) Measured interaural cross-correlation functions for 1/3 octave-bandpass
noise with center frequencies of 125 Hz to 2 kHz; and (b) $W_{IACC}$ as a function of the center
frequency.


6.2. Subjective Attributes of the Sound Fields with
a Single Reflection in Relation to the ACF of the
Source Signals


6.2.1. Preferred Delay of a Single Reflection for Listeners 

Results of the subjective preference tests, as an overall psychological response for
sound fields with a single reflection, indicated that the most preferred delay of the
reflection may be found from the envelope curve of the ACF. It is defined by the
delay $\tau_p$, as in Equations (4.3) and (4.4),

$$[\Delta t_1]_p = \tau_p,$$

such that

$$|\phi(\tau)|_{envelope} = k A_1^{c}, \quad \text{at } \tau = \tau_p, \tag{6.2}$$

where $A_1$ is the pressure amplitude of the single reflection, $k = 0.1$, and $c = 1$. If
the envelope of the ACF is exponential, then the above equation is simply expressed
by Equation (4.1).
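
To make the exponential case concrete, suppose (as an assumption for this sketch)
that the ACF envelope decays as 10^(-τ/τ_e), so that it falls to 0.1 at τ = τ_e.
Setting this envelope equal to k·A_1^c and solving for the delay gives
τ_p = (log10(1/k) - c·log10 A_1)·τ_e. The function names and the value of τ_e below
are hypothetical.

```python
import math

def preferred_delay(tau_e, a1, k=0.1, c=1.0):
    """Delay tau_p at which an exponential ACF envelope 10**(-tau/tau_e)
    equals k * a1**c (Equation 6.2 under the exponential assumption)."""
    if tau_e <= 0 or a1 <= 0:
        raise ValueError("tau_e and a1 must be positive")
    return (math.log10(1.0 / k) - c * math.log10(a1)) * tau_e

def envelope(tau, tau_e):
    """Assumed exponential ACF envelope, equal to 0.1 at tau = tau_e."""
    return 10.0 ** (-tau / tau_e)

tau_e = 0.050                       # hypothetical effective duration: 50 ms
for a1 in (1.0, 0.5, 0.25):         # pressure amplitude of the reflection
    tau_p = preferred_delay(tau_e, a1)
    print(a1, round(tau_p * 1000.0, 1), "ms")
```

With k = 0.1 and c = 1, a reflection of unit amplitude gives a preferred delay equal to
τ_e, and weaker reflections give longer preferred delays.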

Such a relationship also holds for other important subjective responses in relation 
to the temporal factors discussed below. The constants k and c used in calculating 
important subjective responses to the sound field, based on the ACF of the source 
signals, and the amplitude ranges in the experiments, are listed in Table 6.1. 


6.2.2. Preferred Frequency Characteristics of
Reverberation Time

This section discusses the preferred condition for the reverberation time beyond 
the treatment of Section 4.3.3. First, in order to obtain a flat frequency response 
below 3 kHz of the source signals, the right channel of the original recorded signal 
(Burd, 1969) was mixed with the left channel signal as indicated by the symbol of 
(L + R) for the Music Motifs B and C (Table 3.1). In controlling the frequency 
characteristics and eliminating other factors’ effects on preference judgments, the 
following five conditions were imposed (Ando, Okano, and Takezoe, 1989): 

(1) the total amplitude of reflection A was adjusted to a constant 4.0;

(2) the listening level was adjusted to 72 dBA or 74 dBA, according to the Music 
Motif; 

(3) the IACC was fixed at nearly 1.0 by the location of the loudspeakers on the 
median plane; 

(4) the delay times of early reflections were fixed at $\Delta t_n$ = 18.0, 31.1, and 38.4
[ms], n = 1, 2, 3, respectively; and

(5) in order to keep independent reverberations for higher and lower frequency 
ranges separated by 500 Hz, the time difference between these two signals, 
produced by the Schroeder Reverberator (modified), was set at 5 ms. 



Table 6.1. Constants related to the ACF-envelope of source signals for calculating various subjective 
responses to sound fields with a single reflection, in relation to the ACF-envelope. 
In addition, in order to obtain natural and colorless reverberation,
signal-processing filters were set with the following values:

(a) delay times of six comb filters used in the digital reverberator were fixed at
the appropriate values for the music motif, i.e., $\tau_i$ = 28.6, 32.4, 35.3, 37.9,
40.5, 45.8 [ms], i = 1, 2, ..., 6, for Music Motif B(L + R), and $\tau_i$ = 38.1,
43.1, 47.1, 50.5, 53.9, 61.0 [ms] for Music Motif C(L + R); and

(b) to eliminate the effects of coloration, the delay times of two of the all-pass
filters were adjusted within $0.1\tau_e$.

The results of the scale value of preference are shown in Figure 6.5. In this
figure, closed circles indicate values of the preferred reverberation time calculated
by Equation (4.7) in Section 4.3, with the values $(\tau_e)_{min}$ obtained by the running
ACF, 2T = 2 s, and the running interval of 100 ms through the A-weighting
network. The closed squares indicate values calculated by the same equation, with
values of $(\tau_e)_{min}$ obtained for two frequency ranges above and below 500 Hz,
respectively, after passing through the A-weighting network.

The closed circles are closer to the preferred conditions for both types of music 
than the closed squares. The reverberation time for the high-frequency range is 
much more critical than that of the low-frequency range. The range of acceptable 
values for reverberation time in the low-frequency range is quite wide, from 0.5 to 
2.0 times that of the calculated preferred reverberation time (closed circles). Thus, 
flat frequency responses are in the range of the preferred condition. 


6.2.3. Coloration of a Single Reflection 

When we listen to sound very near a boundary wall in a room, coloration is clearly
perceived. Here, such coloration is discussed in relation to the ACF-envelope (Ando
and Alrutz, 1982).

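As a source signal, continuous bandpass noise was used, because the ACF of
the signal is independent of the time interval extracted for subjective judgments
and is theoretically calculable. The normalized ACF of the Gaussian noise after
passing through an ideal bandpass filter with a flat response between frequencies
$f_1$ and $f_2$, $f_2 > f_1$, is given by

$$\phi(\tau) = \frac{\sin(\Delta\omega\,\tau/2)}{\Delta\omega\,\tau/2}\,\cos\!\left(\frac{\Delta\omega_c\,\tau}{2}\right),$$

where $\Delta\omega = 2\pi(f_2 - f_1)$ and $\Delta\omega_c = 2\pi(f_2 + f_1)$.
The envelope of the ACF is expressed by

$$|\phi(\tau)|_{envelope} = \frac{2}{\Delta\omega\,\tau}\,\sin\!\left(\frac{\Delta\omega\,\tau}{2}\right) \quad \text{for } 0 < \Delta\omega\,\tau < \pi \tag{6.3}$$

and

$$|\phi(\tau)|_{envelope} = \frac{2}{\Delta\omega\,\tau} \quad \text{for } \Delta\omega\,\tau > \pi. \tag{6.4}$$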
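
The ACF of ideal bandpass noise and its envelope can be evaluated numerically as a
check on the expressions above: the ACF is a sin(x)/x term in the bandwidth multiplied
by a cosine at the band center, and the envelope follows the sin(x)/x term up to
Δωτ = π and the 2/(Δωτ) decay beyond it. This is a minimal sketch; the function
names and the band edges chosen for the example are assumptions.

```python
import math

def bandpass_acf(tau, f1, f2):
    """Normalized ACF of ideal bandpass Gaussian noise between f1 and f2 (Hz)."""
    d_omega = 2.0 * math.pi * (f2 - f1)
    d_omega_c = 2.0 * math.pi * (f2 + f1)
    x = d_omega * tau / 2.0
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return sinc * math.cos(d_omega_c * tau / 2.0)

def bandpass_acf_envelope(tau, f1, f2):
    """Envelope of the ACF: Eq. (6.3) for d_omega*tau < pi, Eq. (6.4) beyond."""
    x = 2.0 * math.pi * (f2 - f1) * abs(tau)
    if x == 0.0:
        return 1.0
    if x < math.pi:
        return (2.0 / x) * math.sin(x / 2.0)
    return 2.0 / x

# Hypothetical 1/3-octave band around 1 kHz (band edges are an assumption).
f1, f2 = 891.0, 1122.0
for tau_ms in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
    tau = tau_ms / 1000.0
    print(tau_ms, round(bandpass_acf(tau, f1, f2), 4),
          round(bandpass_acf_envelope(tau, f1, f2), 4))
```

The two envelope branches meet at Δωτ = π, where both equal 2/π, and the magnitude
of the ACF never exceeds the envelope.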
Figure 6.6 shows an example of the measured ACF of the bandpass noise
reproduced by a loudspeaker in an anechoic chamber and its envelope curve, which
is calculated by putting the equivalent bandwidth for $\Delta\omega$ into Equation (6.4). Since
the filter and the loudspeaker used had nonideal characteristics, the equivalent
bandwidth had to be chosen to be greater than the difference of the cut-off
frequencies in Equation (6.4). A loudspeaker arrangement that presents the primary sound
and the delayed sound is shown on the right side of Figure 6.7. The total sound
pressure of the two sounds was automatically kept constant at the listener’s
position in the anechoic chamber (60 dBA). The subject adjusted the sound-pressure
level of the weak delayed sound and judged the threshold of the just noticeable
difference, which appeared as a coloration, in comparison to the situation in which
only the primary sound was presented.

Figure 6.6. Measured and calculated ACF of the bandpass noise with a center frequency
of 1 kHz. The curves show the measured ACF, the absolute values of the measured ACF,
and the envelope curve by Equation (6.4).

Figure 6.5. Contour lines of equal scale value of the preference for sound fields with two
early reflections (fixed) and the subsequent reverberation times, versus $T_{sub}$ (< 500 Hz) and
$T_{sub}$ (> 500 Hz). (●): Calculated preferred values at $(\tau_e)_{min}$ for the overall frequency range as
indicated in Table 3.1; and (■): calculated preferred values at $(\tau_e)_{min}$ for the two frequency
ranges. (a) Music Motif B(L + R), SPL = 72 dBA (12 subjects); and (b) Music Motif
C(L + R), SPL = 74 dBA (6 subjects).

The threshold levels of the weak sound relative to the primary sound are shown
in Figure 6.7 as a function of the delay time $\Delta t_1$, for a noise source with a center
frequency of 1 kHz. The dashed curve represents the calculated values obtained
by Equation (6.2) with the envelope of the ACF (Figure 6.6) and using the derived
constants, i.e., $k = 10^{-2.5}$ and $c = -2.0$. Similar results were obtained with
250 Hz and 4 kHz, even if the direction of the weak sound was changed to $\xi$ = 36°
or 90° (Ando and Alrutz, 1982).
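
The calculated threshold curve can be sketched by solving Equation (6.2) for the
amplitude of the weak sound, A_1 = (envelope/k)^(1/c), with k = 10^(-2.5) and c = -2,
and using the bandpass-noise envelope of Equations (6.3) and (6.4). The equivalent
bandwidth used below is a hypothetical value, since the value used in the experiment
is not given here, and the function names are likewise assumptions.

```python
import math

def envelope(tau, bandwidth_hz):
    """ACF envelope of bandpass noise, Equations (6.3) and (6.4)."""
    x = 2.0 * math.pi * bandwidth_hz * tau
    if x == 0.0:
        return 1.0
    return (2.0 / x) * math.sin(x / 2.0) if x < math.pi else 2.0 / x

def threshold_level_db(tau, bandwidth_hz, k=10 ** -2.5, c=-2.0):
    """Relative level 20*log10(A1) that solves envelope(tau) = k * A1**c."""
    a1 = (envelope(tau, bandwidth_hz) / k) ** (1.0 / c)
    return 20.0 * math.log10(a1)

bandwidth = 231.0   # hypothetical equivalent bandwidth in Hz
for delay_ms in (0.0, 1.0, 2.0, 5.0, 10.0, 20.0):
    print(delay_ms, round(threshold_level_db(delay_ms / 1000.0, bandwidth), 1))
```

Because c is negative, the threshold level rises as the envelope of the ACF decays,
starting from -25 dB relative to the primary sound at zero delay.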


6.2.4. Threshold of Perception of a Single Reflection 

Seraphim (1961) investigated the perceptibility (aWs) of a single reflection with
speech sound, as shown in Figure 6.8. The aWs data were obtained under the
condition of a single reflection with a horizontal angle to the listeners of $\xi$ = 30°.
Unfortunately, the ACF of the speech signal used at that time could not be directly
related to the behavior of the aWs. However, it was assumed that the ACFs of
continuous speech signals with normal speed of speech do not differ much. Let
us apply a typical ACF-envelope function of a speech signal as shown in Figure
6.9. Then, the aWs may be described approximately with the ACF-envelope as
indicated in the lower part of Figure 6.10.



Figure 6.7. Threshold level of a weak sound W as a function of the delay time $\Delta t_1$ for
a bandpass noise with a center frequency of 1 kHz. Different symbols indicate responses
of two subjects. Curve: calculated values by Equation (6.2) with $k = 10^{-2.5}$, $c = -2$.
In the loudspeaker arrangement, the weak sound W is presented from ($\xi$ = 0°, $\eta$ = 27°).

Figure 6.8. Threshold level of a single reflection as a function of the delay time $\Delta t_1$ for
a continuous speech signal (Seraphim, 1961).

Figure 6.9. ACF-envelope function of a typical continuous speech signal.

Figure 6.10. Amplitude of a single reflection obtained at several subjective responses in
relation to the ACF-envelope of music and speech signals. The three curves show: the most
preferred condition for the listener calculated by Equation (6.2) with k = 0.1, c = 1; the
threshold of perception of the reflection calculated by Equation (6.2) with k = 2, c = 1;
and the 50% echo disturbance calculated by Equation (6.2) with k = 0.01, c = 4 for
$0 > A_1 > -6.0$ [dB]. (△): aWs by the Beurteilungsverfahren (Seraphim, 1961) with the
ACF-envelope as shown in Figure 6.9; (●): threshold of perception by the limit method
(Morimoto et al., 1982) with the ACF-envelope in Figure 6.9; and the two remaining symbols:
50% echo disturbances, after Haas (1951) and Ando et al. (1974), respectively, with the
ACF-envelope in Figure 6.9.

In order to confirm this result, the values of the perception threshold obtained
by the limit method (Morimoto et al., 1982) are plotted in Figure 6.10, where the
continuous speech signal with the ACF-envelope shown in Figure 6.9 was also
used. Similar results were obtained in spite of the different speech signals used.
Such a threshold of perception may be described by Equation (6.2) with k = 2
and c = 1.

6.2.5. Echo Disturbance of a Single Reflection

In a manner similar to that mentioned above, the echo disturbance data by Haas 
(1951) and Ando, Shidara, and Maekawa (1974, Stereo-System in Fig. 2), may 
be rearranged with the ACF-envelope. Results of the 50% echo disturbance are 
shown in the upper part of Figure 6.10. 



Since the echo disturbance effects in the short-delay range of the single reflection
within 50 ms are unclear, only the constants in Equation (6.2) for the delay range
longer than 50 ms ($10 \log |\phi(\tau)|_{envelope} < -20$ dB) are meaningful, giving
k = 0.01 and c = 4.0. The subjective preferences for listeners are located between
the curves of the threshold (aWs) and of the echo disturbance.


6.2.6. Preferred Delay of a Single Reflection 
for a Performer 

From preference judgments with respect to the ease of music performance by
alto-recorder soloists (Nakayama, 1984), the most preferred delay time of the single
reflection may also be described by Equation (6.2). In this case, the coefficients
are k = 2/3 and c = 1/4.

The coefficient k differs by a factor of about seven from that of the listeners
(2/3 against 0.1). This indicates that performers evaluate the amplitude of the
reflection as about seven times greater than listeners do, namely, the “missing
reflection for performers.” Weaker amplitudes are therefore effective and preferred
for musicians compared with those preferred by listeners.

Some fundamental subjective attributes have been discussed in relation to the
ACF. When the ACF-envelope is expressed approximately by an exponential
[as is expressed by Equation (4.5)], the corresponding amplitudes of reflection may
be described by the delay time of reflection $\Delta t_1$ normalized by the effective
duration of the ACF, $\tau_e$, as shown in Figure 6.11. For listeners, if $\Delta t_1/\tau_e = 1$, then
$20 \log A_1 = 0$ dB. In this figure, the musician’s preference of the reflection as
ease of performance for playing an alto-recorder is also plotted.
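
The constants k and c quoted in this chapter can be compared on the normalized delay
axis. Assuming, for this sketch only, an exponential ACF envelope of the form
10^(-Δt_1/τ_e), Equation (6.2) gives 20 log A_1 = 20(-Δt_1/τ_e - log10 k)/c. The
dictionary and function names below are hypothetical.

```python
import math

# (k, c) pairs quoted in this chapter for Equation (6.2).
CONSTANTS = {
    "listener preference": (0.1, 1.0),
    "threshold of perception": (2.0, 1.0),
    "50% echo disturbance": (0.01, 4.0),
    "performer preference": (2.0 / 3.0, 0.25),
}

def reflection_level_db(normalized_delay, k, c):
    """20*log10(A1) solving 10**(-normalized_delay) = k * A1**c, where
    normalized_delay = delta_t1 / tau_e (exponential envelope assumed)."""
    return 20.0 * (-normalized_delay - math.log10(k)) / c

for name, (k, c) in CONSTANTS.items():
    levels = [round(reflection_level_db(x, k, c), 1) for x in (0.25, 0.5, 1.0)]
    print(name, levels)

# Ratio of the performer and listener coefficients k ("about seven").
ratio = CONSTANTS["performer preference"][0] / CONSTANTS["listener preference"][0]
print(round(ratio, 2))
```

For the listener constants the level is 0 dB at a normalized delay of unity, in
agreement with the statement above.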


6.3. Loudness in Relation to the Effective Duration 
of the ACF 

Under the fixed conditions of the sound-pressure level (74 dB) and of the other
temporal and spatial factors, loudness judgments were performed by changing
the ACF of the bandpass noise within the critical band (Merthayasa and Ando,
1996). The effective duration of the ACF, $\tau_e$, or the repetitive feature, of the
bandpass noise of 1 kHz center frequency is controlled by the bandpass filter slope
used (48, 140, and, by combining these with two digital recorders,
1080 dB/octave). The duration of stimuli was 3 s with a rise and fall time of
250 ms, and the interval between stimuli was 1 s. In this test, the bandwidth $\Delta F$
of stimuli, defined by the -3 dB attenuation of the low and high cut-off frequencies,
was kept constant. In fact, $\Delta F$ was set at “0 Hz” with only the slope components
controlling the wide range of $\tau_e$ (= 3.5-52.6 ms).

The paired-comparison method for judgments was applied to more than six
students of normal hearing ability. Since the subjects sat in an anechoic chamber
facing the loudspeaker located in front at 90 ± 1 cm, the IACC was kept constant
at nearly unity.

Figure 6.11. Amplitude of a single reflection obtained at several subjective responses,
including the preference of alto-recorder performers, as a function of the delay
time $\Delta t_1$ (see Figure 7.3) normalized by the value of $\tau_e$ of music and speech. When the
ACF-envelope is exponential, these values may be calculated approximately by Equation (4.5)
with constants k and c (Table 6.1).

The scale values of loudness as a function of $\log \tau_e$ are shown in Figure
6.12 (Merthayasa, Hemmi, and Ando, 1994). Obviously, the loudness is
influenced by the increasing value of $\tau_e$. Statistical analysis with different values
of $\tau_e$ indicates a significant difference in loudness (p < 0.01). It can be
demonstrated that the degree of the repetitive feature of stimuli contributes to
the loudness. Since there is no significant difference in loudness between the
pure tone and the “0 Hz” bandwidth signal, produced by use of a filter of
1080 dB/octave slopes, use of a sharp slope-filter is recommended in hearing
experiments, which corresponds to the sharpness of the filter in the auditory
system (Section 5.1.4).

Furthermore, as shown in Figure 6.13, it is found that the loudness of the
bandpass noise within the critical band is not constant. Rather, a minimum is indicated at
a certain bandwidth when the filter slope of 1080 dB/octave is used. This evidence
differs from the results of Zwicker et al. (1957).

A similar tendency is observed in that, as the reverberation increases, the value
of $\tau_e$ also increases, as is shown in Figure 6.14(a). Accordingly, loudness increases
as the reverberation time increases, as shown in Figure 6.14(b) (Ono and Ando,
1996).




Figure 6.12. Scale values of loudness as a function of $\log \tau_e$, in which values of $\tau_e$ are
measured in ms.

Figure 6.13. Scale values of loudness of bandpass noise as a function of its bandwidth
centered on 1 kHz. The cutoff slope of the filter used was 1080 dB/octave. Different symbols
indicate the scale values obtained with different subjects (six subjects).

Figure 6.14. (a) Effective duration of the ACF of the sound field as a function of the
subsequent reverberation time $T_{sub}$ [s]; and (b) the average value of loudness obtained by the
constant method as a function of the reverberation time of the sound field.
It is worth noticing that the loudness does not depend on the IACC under the
condition of a fixed sound-pressure level at both ear entrances (Merthayasa, Ando, and
Nagatani, 1995). This confirms results using headphone reproduction (Chernyak
and Dubrovsky, 1968; Dubrovskii and Chernyak, 1969).


6.4. Speech Intelligibility and Clarity in Relation to the 
Temporal Factor (TF) and the IACC 

As far as speech intelligibility is concerned, a speech transmission index (STI) has 
been proposed by Houtgast, Steeneken, and Plomp (1980) in which the tempo- 
ral and monaural factors have been taken into consideration. In addition to such 
monaural effects, binaural effects may contribute to speech intelligibility as well 
as to speech clarity. For the purpose of examining the effects of the spatial factor on 
speech identification, speech intelligibility tests were conducted for a synthesized 
sound field changing the delay time of the single reflection under a constant STI 
condition (Nakajima and Ando, 1991). In order to obtain the suitable dynamic 
range of speech intelligibility and to examine effects of both temporal and spatial 
factors, each monosyllable was joined by both meaningless forward and backward 
noise maskers instead of continuous speech. 

Results are shown in Figure 6.15 which indicates that the speech intelligibility 
is increased with an increase in the horizontal angle of the single reflection to 
the listener and with a decrease in the delay time. In order to explain the results, 
including the spatial effect, there are two simple and comprehensible models to be 
examined. 

(1) The first is the IACC model as described by 

    $SI = f(t) + f(s)$,    (6.5) 

where $t$ = STI and $s$ = IACC. From experimental results, we obtained the 
relations 

    $f(t) \approx 11t + 15t^2 + 50t^3 + 18$ 

and 

    $f(s) \approx 10.5s - 2.5$. 

As shown in Figure 6.16, SI scores calculated by Equation (6.5) agree well with 
measured ones. 

(2) Next, we can introduce BESTI, which is defined by 

    $\mathrm{BESTI} = \{\mathrm{STI}(L), \mathrm{STI}(R)\}_{\max}$,    (6.6) 

where STI(L) and STI(R) denote the STI at the left and right ears, respectively. As shown 
in Figure 6.17, the measured SI scores may also be expressed by means of BESTI. 


Figure 6.15. Speech intelligibility for sound fields with a single reflection as a function of 
the horizontal angle to a listener, with the delay time as a parameter (six subjects). Different 
symbols indicate different delay times: (○) 15 ms; (△) 30 ms; (□) 50 ms; and (●) 100 ms. 


Now the question arises as to which of the two models is more closely related 
to speech intelligibility for the sound field. A further experiment has been carried 
out for clarity judgments which are associated with speech intelligibility under the 
available conditions of varied IACC and fixed BESTI (Nakajima, 1992). On the 
other hand, conditions of fixed IACC and varied BESTI are physically unreal for 
the usual sound field in a room, so that an experiment may not be performed under 
these conditions. 

As shown in Figure 6.18, the scale values of clarity, which were obtained by 
paired-comparison tests (eight subjects), increase with decreasing IACC for fixed 
BESTI or STI. The effect of the IACC is significant ($p < 0.025$). Therefore, the 
scale value of speech clarity in the present experimental conditions is expressed by 
Equation (6.5) with two independent factors, i.e., the IACC as a spatial factor and 
the STI as a temporal factor, because no significant interference effects between 
the two factors are observed. A promising method of calculating the speech intelligibility 
in relation to the delay time of a single reflection has been discussed based 
on the ACF mechanism in the model of the auditory-brain system, as described 
in Section 5.4. One remarkable finding is that the four factors extracted from the 
running ACF of source signals and sound fields, i.e., (1) the value of $\tau_e$, (2) the 
[Figure 6.16; axis: calculated SI score [%]]


Figure 6.16. Relationship between the calculated SI scores by Equation (6.5) and measured 
scores. 



[Figure 6.17; horizontal axis: best-ear STI]


Figure 6.17. The SI scores measured in relation to the BESTI, defined by Equation (6.6). 
[Figure 6.18; horizontal axis: IACC (0 to 1.0)]


Figure 6.18. Scale values of clarity for continuous speech as a function of the IACC and 
as a parameter of STI. The STI is almost the same as the BESTI in this experimental 
condition with symmetrical loudspeaker arrangements for the reflections, but the speech 
clarity differed greatly due to the effect of the IACC (Nakajima, 1992). 


delay time of the first peak $\tau_1$, (3) the amplitude of the first peak $\phi_1$, and (4) the power 
of the signal frame $\Phi(0)$, play an important role in recognition of speech (Shoda 
and Ando, 1996; 1998). 
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Subjective Effects of Sound Field 
on Performers 


The theory of subjective preference allows a conductor or music director to choose 
a program of music that will sound best in a given concert hall. Previously, 
we have described how music of rapid movement with a short $\tau_e$ fits a concert 
hall with a short initial time-delay gap and a short subsequent reverberation time 
(Figure 7.7). Music of slow tempo with a long $\tau_e$ blends with a hall having relatively 
long values for these two temporal factors. It is strongly recommended, therefore, 
that the music motifs to be performed in a given concert hall be chosen so that 
the music program blends with the sound field. 


7.1. Subjective Preference of Performer for Sound Field 
on the Stage 

As is discussed in Chapter 4, the preferred conditions for listeners are strongly 
dependent on the effective duration of the autocorrelation function of source sig- 
nals. Thus, it is assumed that the preferred delay time of reflection for performers 
on the stage depends on the different types of music sources. In order to support 
musicians by the stage reflections, Nakayama et al. (1984, 1986, 1988, 1988) ex- 
amined the preferred conditions of the sound field on the stage by use of seven 
alto-recorder soloists. 

The experiments were conducted by use of a system with delay lines and 
attenuators as shown in Figure 7.1. Single or two early reflections were simulated 
in an anechoic chamber by loudspeakers located at a distance of 1.7 m from 
the center of the subject's head (1.2 m from the floor). The adjustable factors 
were: 

(1) the autocorrelation function with two different music programs as shown in 
Figure 7.2. These two music selections, with tempos (d = 90 and J = 60) 
and with different values of $\tau_e$, were composed by Tsuneko Okamoto for this 
investigation (Table 3.1); 
Figure 7.1. A diagram of the simulation system for a performer. 


(2) the amplitude of reflections relative to that of the direct sound, which is measured 
at the ear-canal entrances, when the reflection is arriving from the frontal 
direction ($\xi = 0°$); and 

(3) the angle of incidence of reflections, $\xi$. 

In the first study, adjusting the delay time of a single reflection ($\xi = 0°$) with 
fixed amplitudes, eight subjects were asked to indicate the most preferred delay 
time $[\Delta t_1']_p$. The prime on a symbol signifies a physical factor defined for the condition of 
the performers, for whom the amplitude of reflection is defined differently. Results 
of the most preferred delay time for a single reflection are shown in Figure 
7.3. Clearly, the most preferred delay time for the single reflection differs significantly 
between the two music motifs, in accordance with the $\tau_e$ of their ACFs, and 
increases with decreasing amplitude of reflection. For other angles of $\xi$, if the 


Figure 7.2. Music scores composed by Tsuneko Okamoto for this experiment and the 
measured normalized ACF of the two music motifs, $2T = 32$ s (Nakayama, 1984). (a) Music 
scores of Motifs F and G; and (b) normalized ACFs of Music Motif F, $\tau_e = 105$ ms, and 
Motif G, $\tau_e = 145$ ms. The values of $\tau_e$ are obtained by the extrapolation of the first 
important decay rate for 5 dB. 
[Figure 7.3; horizontal axis: amplitude of the single reflection [dB]]

Figure 7.3. Most preferred delay time of a single reflection, $[\Delta t_1']_p$, as a function of the 
amplitude of reflection (Nakayama and Uehara, 1988; Nakayama, 1988). (○): Music Motif 
F; (●): Music Motif G. 


amplitude of reflection is adjusted to be of equal loudness, then the most preferred 
delay time remains the same as in Figure 7.3. 

Therefore, the most preferred delay time of the single reflection or the first 
reflection is well described by the duration $\tau_p'$ of the ACF, similar to Equation 
(4.4), which is defined by 

    $[\Delta t_1']_p = \tau_p'$ 

such that 

    $|\phi_p(\tau)|_{envelope} \approx kA'^c$ at $\tau = \tau_p'$,    (7.1) 

where the constants are $k = 2/3$ and $c = 1/4$, and $A'$ is the total amplitude of reflections, 
defined such that $A' = 1$ corresponds to -10 dB relative to the direct sound as measured at 
the ear-canal entrance. This is due to the “missing reflection” phenomenon of 
performers (see Figure 6.11). Thus, for example, $A' = 1/4$ for -22 dB and 1/16 
for -34 dB. 
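
The threshold condition of Equation (7.1) can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the ACF envelope is a synthetic exponential that reaches 0.1 at $\tau_e$ (an assumed model, not measured data), and the dB-to-amplitude conversion simply takes $A' = 1$ at -10 dB as stated above.

```python
import math

def relative_amplitude(level_db, reference_db=-10.0):
    """Linear amplitude A' of a reflection; A' = 1 when level_db equals reference_db."""
    return 10.0 ** ((level_db - reference_db) / 20.0)

def delay_at_envelope_threshold(delays_ms, envelope, a_prime, k=2.0 / 3.0, c=0.25):
    """Return the first delay at which the ACF envelope falls to k * A'**c.

    Linear interpolation is used between samples; None is returned if the
    envelope never reaches the threshold within the sampled range.
    """
    target = k * a_prime ** c
    for i in range(1, len(delays_ms)):
        e0, e1 = envelope[i - 1], envelope[i]
        if e0 >= target >= e1:
            if e0 == e1:
                return delays_ms[i - 1]
            frac = (e0 - target) / (e0 - e1)
            return delays_ms[i - 1] + frac * (delays_ms[i] - delays_ms[i - 1])
    return None

# Synthetic exponential envelope that decays to 0.1 at tau_e (assumed model).
tau_e_ms = 105.0
delays = [0.5 * n for n in range(400)]            # 0 to 199.5 ms in 0.5 ms steps
env = [10.0 ** (-t / tau_e_ms) for t in delays]

a_prime = relative_amplitude(-22.0)               # roughly 1/4
tau_p = delay_at_envelope_threshold(delays, env, a_prime)
print(f"A' = {a_prime:.3f}, preferred delay = {tau_p:.1f} ms")
```

With a measured ACF, the sampled envelope would replace the synthetic `env` list; the search itself does not depend on the envelope being exponential.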

If the envelope of the ACF is exponential, then Equation (4.5) may be applied, 
yielding 

    $\tau_p' = \left(\log_{10}\frac{3}{2} - \frac{1}{4}\log_{10}A'\right)\tau_e$.    (7.2) 

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Figure 7.4 shows the relationship between the most-preferred delay time and the 
calculated value obtained by Equation (7.1) with the ACF-envelope and different 
amplitudes of reflection. The correlation coefficient between them is 0.99 (p < 
0.01). Table 7.1 demonstrates the procedure for obtaining the most-preferred delay 
time of the reflection approximately, by the use of Equation (7.2). 
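
The calculated columns of Table 7.1 can be approximately reproduced from Equation (7.2). The Python sketch below is illustrative: the function name is hypothetical, and the values $k = 2/3$, $c = 1/4$, $\tau_e = 105$ ms (Motif F), and $\tau_e = 145$ ms (Motif G) are taken from the text and the caption of Figure 7.2.

```python
import math

def preferred_delay_ms(tau_e_ms, a_prime, k=2.0 / 3.0, c=0.25):
    """Preferred delay of the first reflection for an exponential ACF envelope.

    Implements (log10(1/k) - c * log10(A')) * tau_e, i.e. Equation (7.2)
    with k = 2/3 and c = 1/4 as defaults.
    """
    return (math.log10(1.0 / k) - c * math.log10(a_prime)) * tau_e_ms

TAU_E_MS = {"F": 105.0, "G": 145.0}            # effective durations of the two motifs
A_PRIME = [1.0, 0.5, 0.25, 0.125, 0.063]       # linear amplitudes listed in Table 7.1

for a in A_PRIME:
    d_f = preferred_delay_ms(TAU_E_MS["F"], a)
    d_g = preferred_delay_ms(TAU_E_MS["G"], a)
    print(f"A' = {a:5.3f}: Motif F {d_f:5.1f} ms, Motif G {d_g:5.1f} ms")
```

The results agree with the tabulated calculated values to within about 0.2 ms; the small differences presumably come from rounding in the original computation.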

In order to obtain the preferred angles of incidence of a single reflection, 
paired-comparison tests were conducted with fixed loudness. The delay time 
of reflection was fixed to the most-preferred condition from Figure 7.3 for a 
constant amplitude of reflection. The resulting scale values of preference as a 
function of the angle of incidence are shown in Figure 7.5. In this figure, the 
scale values for the two music motifs were averaged, because no significant differences 
between them were found. It is found that the important directions of a 
reflection maximizing preference are from the rear ($\xi = 180°$) or 
from above ($\eta = 90°$). Thus, the location of the reflection must be in the median 
plane (Nakayama, 1986). This result differs greatly from the condition of listeners, 
who prefer sound fields with a low value of the IACC. These results suggest 
that reflections from the rear wall and canopy or ceiling of the stage must 
be carefully designed for musicians, and controlled for music programs during 
rehearsals. 



Figure 7.4. Relationship between the preferred delay of the single reflection and the duration 
of the ACF calculated by Equation (7.1) (Nakayama, 1988). Different symbols indicate 
the averaged values obtained by different motifs with different amplitudes of reflection 
($A_1 = -10$, $-22$, or $-34$ dB). (○): Music Motif F; (●): Music Motif G; and (△, ▲): results 
with two early reflections ($A_1 = A_2 = -22$ dB). 
Figure 7.5. Scale values of preference as a function of the horizontal angle of incidence $\xi$ 
of a single reflection to the subject (Nakayama and Uehara, 1988; Nakayama, 1988). The 
angle $\eta = 90°$ signifies that the single reflection arrives from above in the median plane. 


Table 7.1. Judged and calculated preferred delay times of a single reflection 
for alto-recorder soloists. Judged values of $[\Delta t_1']_p$ are shown in Figure 7.3, and 
calculated values of $[\Delta t_1']_p$ are obtained by Equation (7.2) with values of $\tau_e$. 

                                        Judged [ms]           Calculated [ms]
  A [dB]   A' [dB] (= A + 10)   A'      Motif F   Motif G     Motif F   Motif G
  -10        0                  1.0     20.5      28.5        18.5      25.5
  -16       -6                  0.5     25.5      36.5        26.3      36.3
  -22      -12                  0.25    33.0      47.0        34.3      47.4
  -28      -18                  0.125   40.0      58.0        42.2      58.3
  -34      -24                  0.063   50.5      62.5        50.0      69.0

Next, it will be shown that this result holds for a sound field with two early 
reflections. A second reflection ($\xi = 0°$, 54°, 90°, or 126°; $\eta = 90°$) was added to 
the single reflection at $\xi = 180°$ to determine the preferred temporal and spatial 
conditions of the second reflection. The first reflection was fixed at the preferred 
condition given by Equation (7.1) with the total amplitude of reflections $A'$ (see Table 
7.1). The scale values of preference obtained, for amplitudes of $A_1 = -27$ dB 
and $A_2 = -33$ dB ($A_2 = A_1 - 6$ [dB]), are shown in Figure 7.6 as a parameter 
of $(\Delta t_2' - \Delta t_1')$. 

The most-preferred delay time of the second reflection may be roughly found 
as 

    $[\Delta t_2']_p \approx 1.5[\Delta t_1']_p$.    (7.3) 



Figure 7.6. Scale values of preference as a function of the delay time of the first reflection 
and as a parameter of the second reflection, $\Delta t_2' - \Delta t_1'$, for two different music motifs 
(Nakayama, 1988). The most-preferred delay times of the first reflection calculated by 
Equations (7.1a, b) are 40 ms and 55 ms for Music Motifs F and G, respectively ($A_1 = -27$ 
dB, $A_2 - A_1 = -6$ dB). 
As shown in Figure 7.6, it is found that the preferred condition of the second 
reflection depends greatly on the source program played, but is hardly affected 
by the delay time of the first reflection $\Delta t_1'$ ($p < 0.01$), validating Equation (7.1). 
The preferred delay of the second reflection relative to the first, $[\Delta t_2' - \Delta t_1']_p$, 
is about $0.5[\Delta t_1']_p$, shorter than that of the first reflection [Equation (7.3)], 
and may be compared with $0.8[\Delta t_1]_p$ for the condition of the listeners. 
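
For two early reflections, the preferred delays can be estimated in two steps: first the total amplitude $A'$, then Equations (7.2) and (7.3). The Python sketch below is a hedged illustration. It assumes that the total amplitude is the root-sum-square of the individual reflection amplitudes, which is not stated in this section, and all function names are hypothetical. Under that assumption the first-reflection delays come out close to the 40 ms and 55 ms quoted in the caption of Figure 7.6.

```python
import math

def amplitude_from_db(level_db, reference_db=-10.0):
    """Linear amplitude relative to the -10 dB reference used for performers."""
    return 10.0 ** ((level_db - reference_db) / 20.0)

def total_amplitude(levels_db):
    """Root-sum-square of the reflection amplitudes (assumed definition of A')."""
    return math.sqrt(sum(amplitude_from_db(a) ** 2 for a in levels_db))

def first_reflection_delay_ms(tau_e_ms, a_total, k=2.0 / 3.0, c=0.25):
    """Equation (7.2): preferred delay of the first reflection."""
    return (math.log10(1.0 / k) - c * math.log10(a_total)) * tau_e_ms

def second_reflection_delay_ms(first_delay_ms, ratio=1.5):
    """Equation (7.3): preferred delay of the second reflection."""
    return ratio * first_delay_ms

a_total = total_amplitude([-27.0, -33.0])       # A1 = -27 dB, A2 = -33 dB
for motif, tau_e in (("F", 105.0), ("G", 145.0)):
    d1 = first_reflection_delay_ms(tau_e, a_total)
    d2 = second_reflection_delay_ms(d1)
    print(f"Motif {motif}: first {d1:.1f} ms, second {d2:.1f} ms")
```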

It is worth noting that the preferred incident angles of the second reflection are 
similar to the results of the first reflection, as shown in Figure 7.5. 


7.2. Influence of the Music Program Selection 
on Performance 

First of all, it is recommended that musicians select music programs 
befitting the concert hall with its given temporal factors: the initial time-delay gap of the 
early reflection and the reverberation time for listeners. Figure 7.7 shows the 
recommended ranges of reverberation time according to the range of $\tau_e$ of source 
signals for several music programs. It is quite natural that composers write music 
with the acoustic condition of a certain room in mind, with temporal factors represented 
by “reverberance.” For example, music tempo and melody are slow enough 
for a pipe organ in a large cathedral with a long reverberation time. Chamber music 


[Figure 7.7; axes: effective duration of ACF, $\tau_e$ [ms] (0 to 150), and preferred subsequent reverberation time, $[T_{sub}]_p$ [s]; row labels: speech, vocal music, orchestra music, pipe organ music, with markers (a) and (b)]



Figure 7.7. Estimated ranges of the effective duration of the ACF of sound sources, and 
the preferred reverberation times for listeners. 





is performed in a small concert hall with a short reverberation time. It is an interesting 
fact that the fast, high-tempo music of Mozart was intended to be performed in the guest 
rooms of courts with small audiences. 

On the other hand, for music performers, the initial time-delay gap between the 
direct sound and the first reflection is controlled by adjusting the height of the 
canopies above the stage (Section 8.4), according to the ACF duration of music 
sources as discussed above. This kind of control may be exercised by a “sound 
coordinator,” in the rehearsal prior to the performance. 

Considering the fact that, for listeners, reflections from side walls including 
those on the stage are effective in decreasing the IACC, and that reflections from 
above the stage or from the rear wall of the stage are very important for performers, 
these conditions can typically be realized at the same time without any serious 
contradictions. 


7.3. Selection of the Performing Position for Maximizing 
Listener’s Preference 

The performing position that minimizes the IACC of the sound field for listeners’ 
seats is demonstrated here by an example (Mouri, Mori, and Ando, unpublished). 
Scale values of preference are calculated with Music Motif B (Arnold; $\tau_e = 35$ ms) 
at 112 listeners’ positions in a Bekesy Courtyard (Bekesy, 1934). For simplicity, 
the directivity of the sound source is assumed to be uniform in this calculation. 
The height of the sound source is 80 cm from ground level, and the height of the 
listeners’ ears is 110 cm. The reverberation time is 1.5 s. 

The contour lines of equal average value of the IACC for the 112 listeners’ positions, 
calculated to find the optimum performing positions, are plotted in Figure 7.8. The 
effective positions for performance may be found in the area minimizing the IACC 
for all of the listeners’ positions. This indicates the importance of the side walls 
(near the performing position) in decreasing the IACC in the audience area. The 
most effective performing position is indicated by the symbol $[S]_p$ in Figure 7.9. 
When the sound source is located at $[S]_p$, the orthogonal factors (except for the 
reverberation time) and the related scale values of subjective preference at each 
listener’s position are as shown in Figure 7.9. The reverberation time used here is 
the one measured by Bekesy (1.5 s). The preferred seating positions are found in 
the area centered on the scale value of -0.65. This reveals a good sound field in 
the courtyard without reflection from above. Bekesy (1934, 1967) mentioned at 
that time (and nearly all listeners and musicians agreed) that the musical quality 
of the sound field in the courtyard was much better than in the concert hall where 
the orchestra usually played. 

Bosse (1997) emphasized the importance of blending music performance and 
the concert hall as the heart of music. Professional musicians may change the 
style of performance during rehearsal, blending the music and sound field in a 
concert hall in the manner discussed in Section 3.2. For a given concert hall, after 
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Figure 7.8. Contour lines of equal scale values of subjective preference calculated at each 
performing location averaged for 30 listening positions. The most effective location of 
music performance is found centered on $[S]_p$. 



Figure 7.9. Contour lines of equal scale values of preference of listeners, when the 
performing position is located at the most effective source location $[S]_p$. 
construction, musicians will learn what kind of music fits the sound field, and 
how to control the style of performance of the music to be played. The acoustic 
factors in a concert hall are not significantly changed by aging over a long 
period. 
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Passive Control of Sound Field 
by Design 


The acoustic design of each part of a room, improving the quality of the sound 
field at the listener’s position, is demonstrated here, based mainly upon the spatial 
criteria. 


8.1. Control of the IACC by Side Walls 

The most effective and widely accepted factor in subjective preference judgment 
is the IACC. The shape of the concert hall can be best designed by minimizing 
the IACC, and thus maximizing the subjective preference, of listeners at each seat 
for any kind of sound source. The most remarkable fact is that the average values 
of the interaural cross-correlation $\Phi_{lr}(0)$ for five music signals (Motifs A-E) show 
a minimum at an incident angle of about $\xi = 55°$ from the median plane to the 
listeners, as shown in Figure 8.1. The values of $\Phi_{ll}(0)$ and $\Phi_{rr}(0)$, 
which correspond to the average energies of the sound at the left and right ears, 
respectively, show a maximum difference at the angle centered on $\xi = 55°$. 
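
To make the roles of the left-ear energy, the right-ear energy, and the interaural cross-correlation concrete, the following Python sketch computes an IACC value from two sampled ear signals. It is a minimal illustration, not the procedure used for Figure 8.1: the function name is hypothetical, the search window of ±1 ms for the interaural delay is the conventional choice and is assumed here, and the test signal is synthetic.

```python
import math

def iacc(left, right, fs, max_lag_ms=1.0):
    """Maximum of the normalized interaural cross-correlation within +/- max_lag_ms.

    The cross-correlation at each lag is divided by sqrt(E_l * E_r), where
    E_l and E_r are the signal energies at the left and right ears (lag 0).
    """
    n = min(len(left), len(right))
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    energy_l = sum(x * x for x in left[:n])
    energy_r = sum(x * x for x in right[:n])
    norm = math.sqrt(energy_l * energy_r)
    if norm == 0.0:
        raise ValueError("both ear signals must contain energy")
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best

fs = 48000
sig = [math.sin(2 * math.pi * 500 * i / fs) + 0.5 * math.sin(2 * math.pi * 1300 * i / fs)
       for i in range(480)]
delayed = [0.0] * 12 + sig[:-12]          # same signal arriving 0.25 ms later

iacc_same = iacc(sig, sig, fs)            # identical ear signals
iacc_delayed = iacc(sig, delayed, fs)     # pure interaural delay
print(f"identical: {iacc_same:.3f}, delayed copy: {iacc_delayed:.3f}")
```

Identical signals at the two ears give the maximum value of 1, and a pure interaural delay within the search window stays close to it, because the normalized cross-correlation is maximized over the lag. A lower IACC requires ear signals that differ in more than a delay.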

In order to judge how a fundamental space may be formed, the IACC is calculated 
in the audience area by using the image method for a concert hall of 
width $W$, length $L$, and height $H$, as shown in Figure 8.2. The floor area was 
fixed, and the stage area was fixed at one-third of the seat area. As discussed in 
Section 8.5, to minimize the attenuation of direct sound, the seating floor was 
inclined by 12° from the horizontal plane. For simplicity, the sound source was 
placed at the center of the stage, 1.5 m above the floor, and 20% of the distance 
from the front side of the stage. The absorption coefficient of the seating area 
was assumed to be 0.65. The simulation calculation was performed at 80-100 seat 
positions with the receiving point placed at the height of the listener’s ear, 1.1 m. 
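
The image method replaces each wall reflection by a mirrored copy of the source. The Python sketch below shows only the first-order side-wall images in plan view, as a simplified illustration of the geometry. It is not the full simulation described above: the function name and coordinate convention are hypothetical, the listener is assumed to face the stage along the hall axis, the sound speed of 343 m/s is assumed, and floor inclination, absorption, and higher-order images are ignored.

```python
import math

def first_order_side_reflections(width, source, receiver, c=343.0):
    """First-order side-wall image sources in plan view.

    The hall spans 0 <= x <= width; source and receiver are (x, y) tuples with
    the stage at small y. Returns, for each side wall, the arrival angle from
    the listener's frontal direction, the delay after the direct sound, and
    the level relative to the direct sound (spherical spreading only).
    """
    xs, ys = source
    xr, yr = receiver
    direct = math.hypot(xr - xs, yr - ys)
    results = []
    for wall, x_img in (("left", -xs), ("right", 2.0 * width - xs)):
        path = math.hypot(x_img - xr, ys - yr)
        angle = math.degrees(math.atan2(abs(x_img - xr), yr - ys))
        results.append({
            "wall": wall,
            "angle_deg": angle,
            "delay_ms": (path - direct) / c * 1000.0,
            "level_db": 20.0 * math.log10(direct / path),
        })
    return results

# Source and listener on the center line of a 20 m wide hall, 20 m apart.
reflections = first_order_side_reflections(20.0, (10.0, 0.0), (10.0, 20.0))
for r in reflections:
    print(f"{r['wall']:5s} wall: {r['angle_deg']:.1f} deg, "
          f"{r['delay_ms']:.1f} ms, {r['level_db']:.1f} dB")
```

Repeating the calculation for several widths gives the angle, delay, and level of the lateral reflections at each seat, which are the inputs needed before an IACC value can be estimated.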

The contour lines of equal IACC values (using Music Motif B) for each room 
shape are shown in Figure 8.3. It is well known that the IACC is increased near the 
sound source due to the strong direct sound. By narrowing the hall width $W$, the 
IACC is decreased, and the seat area for IACC < 0.5 is increased. This indicates 
the importance of sound reflection from the side walls. In order to control the IACC 
Figure 8.1. Average values [for five Music Motifs (A through E)] of the interaural cross-correlation 
and the autocorrelation functions ($\tau = 0$) at the two ears for a single sound 
arriving from the horizontal angle $\xi$, which are needed for the calculation of the IACC ($\tau = 0$). 
Values of the autocorrelation functions, $\Phi_{ll}(0)$ and $\Phi_{rr}(0)$, correspond to the sound energies 
at each ear. 


in the neighborhood of the source location, the angles of the side walls on the stage 
must be carefully designed (Section 10.1). 


8.2. Control of the IACC by Ceilings 

The effects of the variation of the ceiling surface angle, as shown in Figures 8.4 
and 8.5, were examined. Figure 8.4 shows the Auditorium at Kobe University, 
Figure 8.5(a) shows Boston Symphony Hall with a height $H = 18$ m, and Figure 8.5(b) 
shows Boston Symphony Hall when the height $H$ is adjusted to 9 m. 

Similar to the above, the sound source was placed at the center of the stage and 
the scale values of subjective preference for the IACC at specified seats were calculated. 
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Figure 8.2. Simple geometry of a simulated concert hall. 


The results of the scale values of the subjective preference $S_4$ due to the IACC 
(described in Section 4.4) are shown in Figure 8.6. When the ceiling angle is 
$\psi = 30°$, the sound field of the auditorium at Kobe University is much improved, 
up to the quality level in Boston Symphony Hall. Even if the height of 
Boston Symphony Hall is adjusted to be much lower than the existing level, the 
results are similar to those of the existing hall with $H = 18$ m. The most important 
fact is that, when the ceiling shape is flat, the IACC is maximized and the 
scale value of preference is accordingly minimized. 


8.3. Influence of Diffusers on Walls and Ceilings 

8.3.1. Number Theory for Diffusers 
to Avoid the Image Shift 

In the fundamental schemes of the concert hall described in the previous sections, 
Schroeder’s diffusers may be added to the center part of the ceiling to avoid a strong 
reflection from the median plane. The diffuser design is illustrated in Figure 8.7. 
The design-frequency range of this diffuser is 250-1750 Hz (Schroeder, 1979). 
The reflection patterns for the diffuser are demonstrated in Figure 8.8. At the 
present stage of scientific knowledge, the calculation of the IACC at each seat in 
a concert hall with the Schroeder diffusers is difficult. But, it is interesting that 
the measured IACCs are smaller at the center parts of a hall than those calculated 
Figure 8.4. Auditorium at Kobe University with a ceiling angle $\psi$ varied in simulation, 
improving the sound-field quality with respect to the spatial factor, IACC. 


without diffusers, showing that the diffusers close to the stage may be effective in 
reducing IACC as well (Ando et al., 1992). 
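
The number-theoretic basis of such a diffuser can be sketched briefly. In the usual quadratic-residue design, the well depths follow the sequence $n^2 \bmod N$ scaled by the design wavelength. The Python example below is a one-dimensional illustration only: the function names are hypothetical, the sound speed of 343 m/s is assumed, and taking the lower limit of the stated frequency range (250 Hz) as the design frequency is an assumption, not a detail confirmed by the text.

```python
def quadratic_residue_sequence(n_prime):
    """Quadratic-residue sequence s_n = n^2 mod N for n = 0 .. N-1."""
    return [(n * n) % n_prime for n in range(n_prime)]

def well_depths_m(n_prime, design_freq_hz, c=343.0):
    """Well depths d_n = s_n * lambda_d / (2N) for design wavelength lambda_d."""
    wavelength = c / design_freq_hz
    return [s * wavelength / (2.0 * n_prime)
            for s in quadratic_residue_sequence(n_prime)]

sequence = quadratic_residue_sequence(17)
depths = well_depths_m(17, 250.0)
print(sequence)
print([round(d, 3) for d in depths])
```

The sequence is symmetric about its center and repeats with period $N$, which is why one period of $N = 17$ wells defines the whole surface.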

A Schroeder diffuser designed for the high-frequency range above 2 kHz 
may also be applied to side walls to avoid the image shift of the sound source for 
listeners. A deformed shape of this, excluding wells, was applied on the tilted side 
walls in the Kirishima International Concert Hall (Miyama Conseru) as described 
in Section 10.1. 

Since a significant amount of absorption due to the wells is found (Fujiwara, 
1997; Onitzuka and Kawakami, 1997), a surface structure without wells will be 
discussed below. 


8.3.2. Scattered Reflection by Uneven Surfaces 

Other possible diffusers that avoid the image shift are realized by means of obsta- 
cles such as triangular forms and circular arcs on the reflecting walls. Calculated 
reflection patterns are shown in Figure 8.9. In this calculation, the diffusers are 
assumed as an infinite periodic structure, so that the reflection patterns are discrete 
as indicated by the arrows in these figures (Masuda and Fujiwara, 1997). When 
the dimension of the periodical surface is finite (5 m), then a continuous reflection 
pattern may be obtained instead of a discrete pattern (Figure 8.10). When the 


Figure 8.5. Boston Symphony Hall with a ceiling angle $\psi$ varied [...] quality with respect to the spatial factor, IACC [...] (changed). 
Figure 8.6. Scale values of subjective preference $S_4$ with respect to the IACC (Music Motif 
B). ( ): Auditorium at Kobe University; ( ): Boston Symphony Hall, $H = 18$ m 
(original); and ( ): Boston Symphony Hall, $H = 9$ m (simulated). 


[Figure 8.7; labels: ceiling; cross-dimension of hall]

Figure 8.7. The two-dimensional Schroeder diffuser based on quadratic residues of $N = 17$ 
(Schroeder, 1979). 
Figure 8.8. Measured and calculated reflection patterns for a scale model of the Schroeder 
diffuser, $N = 17$ (Strube, 1980; 1981). (a) $\lambda_d/\lambda = 1.0$; (b) $\lambda_d/\lambda = 2.0$; and 
(c) $\lambda_d/\lambda = 3.0$. 

Figure 8.9. (a) Calculated directions of reflection for a two-dimensional periodic triangle 
shape; and (b) for the periodic circular-arc shape (Masuda and Fujiwara, 1997). The height 
of both shapes is 20 cm, and the period is 100 cm. 


receiving point is close to the uneven surface, then a more diffusive pattern is 
observed, as shown in Figure 8.11. This is caused by a number of damped scattering 
waves that exist only near the uneven surfaces (Ando and Kato, 1976). 
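
The discrete reflection directions of an infinite periodic surface follow from the standard grating relation, $\sin\theta_m = \sin\theta_i + m\lambda/d$, where $d$ is the period and $m$ an integer order. The Python sketch below lists the propagating orders for a given frequency. It is an illustration under assumptions: the function name is hypothetical, angles are measured from the surface normal with the specular order at $m = 0$, the sound speed is taken as 343 m/s, and only the directions are given, not the energy in each order, which depends on the surface profile.

```python
import math

def reflection_orders(freq_hz, period_m, incidence_deg=0.0, c=343.0):
    """Propagating diffraction orders (m, angle in degrees) of a periodic surface.

    Uses sin(theta_m) = sin(theta_i) + m * lambda / d and keeps only the
    orders with |sin(theta_m)| <= 1.
    """
    wavelength = c / freq_hz
    s0 = math.sin(math.radians(incidence_deg))
    m_max = int(math.floor((1.0 + abs(s0)) * period_m / wavelength)) + 1
    orders = []
    for m in range(-m_max, m_max + 1):
        s = s0 + m * wavelength / period_m
        if abs(s) <= 1.0:
            orders.append((m, math.degrees(math.asin(s))))
    return orders

# Period of 100 cm, as for the triangular and circular-arc shapes.
low = reflection_orders(300.0, 1.0)    # wavelength longer than the period
mid = reflection_orders(686.0, 1.0)    # wavelength equal to half the period
print(low)
print(mid)
```

When the wavelength exceeds the period, only the specular direction remains, so a surface with a 100 cm period scatters into additional directions only above roughly 343 Hz at normal incidence.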


8.3.3. Fractal Geometry for Desired Sound Reflections 

For a wider frequency range of reflections, fractal structures are proposed for the 
desired directions, as shown in Figure 8.12. In order to control the proper reflection 
angle to the listeners, according to Figure 6.3 for each frequency band, a fractal 
structure (Mandelbrot, 1982) may be applied as shown in Figure 8.12(b) and (c). 
Calculated results are shown in Figure 8.13 (Ando and Sakamoto, 1988; Dai and 
Ando, 1983). In order to minimize the IACC in each frequency band, three desired 
directions for the low-, middle-, and high-frequency reflections may be realized by 
fractal geometry. For example, this kind of fractal geometry may be applied for 
Figure 8.10. (a) Calculated directions of reflection for the finite length (5 m) of a periodic 
triangle shape; and (b) of a circular-arc shape (Masuda and Fujiwara, 1997). The receiving 
point is at a distance of 10 m from the center of the reflector. The amplitude of both shapes 
is 20 cm and the period is 100 cm. 


side walls on the stage, obtaining more lateral reflections for the low-frequency 
range (below 500 Hz), and smaller angles to the median plane for listeners at 
higher frequencies. The complicated structures with the wells may be deformed 
more effectively to avoid excess absorption. 


8.4. Reflectors near the Ceiling 

8.4.1. Transfer Function for Reflection from Various 
Single Panel Shapes 

Applying the Rubinowicz representation of the Kirchhoff diffraction integral, regarded 
as the mathematical formulation of Young’s theory (Born and Wolf, 1970), 
Figure 8.11. Calculated directions of reflection for the finite length (5 m) of the periodic 
triangle shape (Masuda and Fujiwara, 1997). The receiving point is varied from 5 m to 15 m 
from the center of the reflector. The amplitude of both shapes is 20 cm and the period is 
100 cm. Circles denote the directional reflection levels shown in Figure 8.9(a). 


which converts the surface integration into a line integration around the contour of 
a reflector, the transfer function for reflection of single panel reflectors is calculated 
(Nakajima, Ando, and Fujita, 1992). In order to obtain flat-frequency characteristics 
for reflection from a single plate, three types of single reflectors with a 
constant area of 4 m² are examined. The source point and receiving point are located 
at 14.14 m from the center of the reflectors. Results of the transfer function 
are shown in Figure 8.14(a-c), as a parameter of the angle of sound incidence $\theta$ 
to the center of the plates. It is observed that the amplitude fluctuation of the 
transfer function increases with the number of sides of the polygon. Dips in the 
transfer function result from the simultaneous arrival of negative boundary waves 
at the receiving point from the edges, as indicated by the impulse responses shown 
on the right of Figure 8.14. Therefore, a triangular reflector is recommended to 
obtain flat-frequency characteristics. Further results show that the interior angle 
[Figure 8.12, panels (a), (b), and (c); labels: x, incident wave]

Figure 8.12. Formation of a fractal geometry for a reflector for controlling the direction 
of reflection for each frequency band. (a) Basic structure; (b) form of fractal geometry; and 
(c) desired directions, $\theta = 0°$, $-13.5°$, and $-29.8°$, of reflections for the low (L), middle 
(M), and high (H) frequency bands, respectively. 
Figure 8.13. Calculated directions of reflection for the fractal geometry (Figure 8.12(c)), 
as a function of the normalized frequency $F_1$, for the three frequency bands, L, M, and H 
(Ando and Sakamoto, 1988). The normalized frequency is given by $F_1 = f/f_{d1}$, $f_{d1}$ being 
the critical frequency between L and M to be designed. 


of the isosceles triangular reflector is recommended to be in the range of 90° to 
120°. 


8.4.2. Transfer Function for Panel Arrays 

In order to confirm the above results, the arrays shown in Figure 8.15(a-c), 
composed of the three shapes of reflectors, are examined. Each array has 35 panels; 
the total area of an array is 280 m² and the total panel area is 140 m², so that the ratio of these 
areas is 50%. The transfer functions shown in these figures are calculated when 


Figure 8.14. Calculated transfer function and the impulse response for reflection from a 
reflector as a parameter of the angle of sound incidence to the surface. (a) Triangle reflector; 
(b) square reflector; and (c) circular (decagon) reflector. 





Figure 8.15. Calculated transfer function for the reflection from arrays of reflectors. (a) Array 
of triangle reflectors; (b) array of square reflectors; and (c) array of circular (decagon) 
reflectors. 


[Figure 8.15 panel titles: (A) geometrical reflection condition; (B) no geometrical reflection condition]
the sound wave impinges on the center of the array with the incident angle $\theta = 45°$. 
In Figure 8.15, the solid and dashed curves represent the calculated results for the 
panel arrays and for single panels, respectively. 

Obviously, the dips in the transfer function of a triangular panel array in the 
low-frequency range (below 1000 Hz) are much smaller than the others. When 
a geometrical-ray reflection exists on a single panel, the transfer function in the 
high-frequency range is almost the same as that of the central single panel in the 
array. There are low-frequency components which do not exist for the reflection of 
a single panel. This phenomenon is caused by the diffraction effects of neighboring 
multiple panels as demonstrated in the next section. 

For further information, the solid lines in Figure 8.15 indicate Rindel’s estimation 
lines of the transfer function for a rectangular panel array (Rindel, 1986). 
The amplitudes of the transfer functions for panel arrays are in close agreement with 
Rindel’s estimation only in the case when the path of geometrical reflection exists 
on the center of the panel. 


8.4.3. Lateral Reflection Components from Canopies 

In the Boston Symphony Orchestra’s Tanglewood Music Shed, the canopy, which 
consists of nonplanar triangular panels, plays an important role in the decrease 
of the IACC, because there are no side-wall reflections (because of the fan shape 
of the Shed). The sides of the triangular panels range from 2.5 m to 8.0 m, and 
the opening spaces are triangular and of the same dimensions as the panels. The 
canopy is suspended about 6.5 m above the audience floor and extends over the 
stage as well as the frontal part of the audience area. Figure 8.16(a) is a typical 
example of the transfer function calculated for the panel array composed of 13 
nonplanar triangular panels located within the ellipse drawn in the figure. The 
related impulse response is shown at the bottom of the figure. Figures 8.16(b)-(g)
show the transfer function for individual panels, if all the neighboring panels
are removed. In these figures, 0 dB refers to the level of the direct sound from
the source to a receiving point without any reflection. It is remarkable that
relatively strong low-frequency components arrive from panels away from the median
plane, as shown in the transfer functions. These reflections are adequate to
decrease the IACC for the audio-frequency range. Moreover, the high-frequency
components from the panel in line above help to avoid an image shift of
the sound source, keeping the maximum value of the interaural cross-correlation
function at the time origin, $\tau_{IACC} = 0$. Also, the low-frequency components
near 200 Hz may compensate for the large attenuation due to the interference
between the direct sound and the reflection of the seat rows and the floor (see
Section 8.5).

In another example of an existing hall designed by Nakajima, Ando, and Fujita 
(1992), triangular reflectors are installed above and near the stage. Triangular 
reflectors with an angle of about 120° show effective reflections over a wide
frequency range. When such reflectors are installed above the left and right stage,
Figure 8.16. Calculated transfer function for reflection from a panel array composed of
the 13 nonplanar triangular panels within an ellipse indicated by "a," which were installed
in the Tanglewood Music Shed (a). The corresponding impulse response is indicated on
the lower part of this figure. Calculated transfer function for reflection from each single
nonplanar triangular panel within the ellipse without all neighboring panels (b)-(g).
In the figures, 0 dB refers to the amplitude of the direct sound from the source located at
$P_s$ (11.6 m, 6.5 m, 1.5 m) to a receiving point at $P$ (24.0 m, 0.0 m, 1.5 m).
Figure 8.17. The IACC with music signal (Music Motif B) in an existing concert hall.
(a) Measured values with the panel array composed of the seven nonplanar triangular panels
installed above the stage; and (b) calculated values without the panel array.


then the lateral reflections in the low-frequency range may serve to decrease the 
IACC. 

Results of the measured IACCs with the triangular reflectors above the stage are
shown in Figure 8.17(a), and the calculated IACC without the reflectors is shown
in Figure 8.17(b). When the reflectors are installed above the stage and/or near the
stage, the IACC values at seats close to the stage are decreased. According to
the effective duration of the ACF of program sources, this kind of reflector above
the stage is quite useful for musicians as well, supporting their performance by
controlling the delay time of reflections by means of the height of the canopy (Section 7.1).


8.5. Floor Structure and Seating 

8.5.1. Low-Frequency Attenuation, and Effects of the 
Angle of Wave Incidence 

In order to find the low-frequency attenuation and the effect of the angle of wave
incidence $\theta$, the sound transmissions over seat rows are calculated (Ando, Takaishi,
and Tada, 1982). The angle of wave incidence is varied in the range $\theta$ = 70°-89°,
and the acoustic admittance of the floor is kept constant at 0.2, which roughly
corresponds to the absorption coefficient of the real floor with seats. Calculated
results are shown in Figure 8.18 as a function of the frequency and as a parameter
of the angle of wave incidence. The low-frequency attenuation appears around
100 Hz, independent of the angle of wave incidence. The sound pressure
throughout the calculated frequency range decreases uniformly with an increasing
angle of incidence. When the angle is kept smaller than 80°, the excess
attenuation remains less than 4 dB, except for the dip-frequency range.

It has been demonstrated that the maximum attenuation in the dip-frequency 
range diminishes with increasing absorption by the floor. The attenuation phe- 
nomenon is demonstrated in an existing hall by the measurement of the impulse 
response over the seat rows, as shown in Figure 8.19. 


8.5.2. Effects of Under-Floor Cavities 

The large attenuation occurring at the dip-frequency range can be reduced by 
making the floor sound absorbent. To accomplish this in the low-frequency range, 
the effects produced by slit resonators installed under the floor are examined. 



Figure 8.18. Calculated sound pressures over seat rows with the angle of incidence $\theta$ as
a parameter (the period of the rows is 90 cm). The specific acoustic admittance of the floor
surface is fixed at 0.2. Curves are shown for $\theta$ = 70°, 75°, 80°, 85°, and 89°; the
horizontal axis is frequency [Hz].
Figure 8.19. Measured sound pressure over the seat row in an existing auditorium,
$\theta$ = 85°. $d_0$ is the distance of the direct sound from the source point to the listening
position.


Figure 8.20 shows the sound pressure at ear level (without seats which are 
not effective in the low-frequency range) as a function of the frequency and as a 
parameter of the angle of incidence. There are no significant dips, indicating that 
sound absorption by the floor with a slit resonator is sufficient to remove the sound 
transmission dip. This characteristic is also demonstrated in calculating the seat 
rows. 

A further practical alternative for improving the sound transmission over the
seats and for decreasing the IACC is to design the space under the floor in a
manner similar to that of the space above the ears. A typical example is shown
in Figure 8.21: an air gap is left under the floor, and the bottom of the gap has
diffusing characteristics or shapes similar to those of the ceilings discussed in
Section 8.2. This seating design has been adopted in the hall in Kobe, as described
in Section 10.2.
Figure 8.20. Calculated sound pressure over a slit resonator under the floor with the angle
of incidence $\theta$ as a parameter, without any seats (the period of the seats is 90 cm) (Ando,
Takaishi, and Tada, 1983). Curves are shown for $\theta$ = 1°, 21°, 41°, 61°, and 70°.



Figure 8.21. A schematic example of designing the structure under the floor [see also
Figure 10.9(b)]. Labeled elements: diffusers; orchestra pit and lift.
9. Individual Listener Subjective
Preferences and Seat Selection


The minimum unit of audience is the individual. If each individual is satisfied by
the environment, then the whole audience is satisfied. But the opposite is not true.
Even if the condition of preference as a general standard is satisfied in the initial
design, some listeners may not be satisfied. In this chapter,
first of all, a method is discussed for determining individual subjective preference
for the sound fields. Then, the results of individual differences are presented. As an
example of application, a seat selection system that enhances individual preference
is described.


9.1. Individual Preference According 
to Orthogonal Factors 


9.1.1. A Simple Method for Obtaining 
Individual Preference 

Considering the fact that members of the audience, including children and older
people, are quite diverse, a method for subjective judgments should be as simple
as possible. For this purpose, the paired-comparison method is selected.
Another method, for example, the method of magnitude estimation, is too
difficult for most people, except for skilled technicians in laboratory experiments.
The paired-comparison method usually needs a number of judgments
for a single pair. However, an approximate method is described here to obtain the
scale value of subjective preference from a single observation datum for a set of
sound fields. This method is based on the law of comparative judgment,
using the linear range of the normal distribution between the probability and the scale
value.


Y. Ando, Architectural Acoustics 
© Springer- Verlag New York, Inc. 1998 
The probability that a sound field B is preferred to another sound field A is
expressed by

$$P(B > A) = \frac{1}{\sqrt{2\pi}} \int_{Z_{ab}}^{\infty} \exp\left(-\frac{y^2}{2}\right) dy, \tag{9.1}$$

where

$$Z_{ab} = -\frac{\langle X_d \rangle}{\sigma_d}, \tag{9.2}$$

$\langle X_d \rangle$ is the average scale value between the sound fields A and B, $X_d = X_b - X_a$,
and $\sigma_d$ is used as the unit for the scale value: $\sigma_d = 1$.

Thus,

$$P(B > A) = \mathrm{erfc}[Z_{ab}], \tag{9.3}$$

$$Z_{ab} = \mathrm{erfc}^{-1}[P(B > A)]. \tag{9.4}$$

The first-order approximation of the Taylor series of Equation (9.4) is given by

$$Z_{ab} = -\sqrt{2\pi}\left(P(B > A) - \frac{1}{2}\right). \tag{9.5}$$

The linear range can be obtained for

$$0.05 < P(B > A) < 0.95.$$

Let us now consider a number of sound fields given by $F$ ($i, j = 1, 2, \ldots, F$),
and suppose a single response from each pair for simplicity. Then the probability
$P(B > A)$ in Equation (9.5) is replaced by (Ando and Singh, 1996)

$$P(i > j) = \frac{1}{F} \sum_{j=1}^{F} Y_i, \tag{9.6}$$

where $Y_i = 1$ corresponds to a preference of $i$ over $j$, $Y_i = Y_j = 0.5$ ($i = j$), while
$Y_i = 0$ corresponds to a preference of $j$ over $i$. To keep the accuracy of the
probability $P(i > j)$ high when using Equation (9.6), a certain minimum number of
sound fields within the linear range are needed.
This is ensured by a preliminary investigation, avoiding any extreme sound field
outside the linear range. In this manner, the scale value $S_i = Z_{ij}$ ($i = 1, 2, \ldots, F$)
may be obtained approximately, when $Z_{ij}$ with $P(j > i)$ is obtained by Equation
(9.5); since $P(j > i) = 1 - P(i > j)$, this gives $S_i = \sqrt{2\pi}\,(P(i > j) - 1/2)$.
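
The procedure above can be sketched in a few lines. The following is a minimal illustration, not the authors' program: the function name `scale_values` and the four-field judgment matrix are hypothetical, and the scale value is computed as $\sqrt{2\pi}\,(P(i > j) - 1/2)$ with $P(i > j)$ taken from Equation (9.6).

```python
import math

def scale_values(Y):
    """Approximate scale values from a single round of paired comparisons.

    Y[i][j] is 1 if field i was preferred over field j, 0 if j was
    preferred over i, and 0.5 on the diagonal (i == j).
    """
    F = len(Y)
    values = []
    for row in Y:
        p = sum(row) / F                                    # Equation (9.6)
        values.append(math.sqrt(2 * math.pi) * (p - 0.5))   # linear approximation
    return values

# Hypothetical judgments for four sound fields (not data from the text).
Y = [
    [0.5, 0, 0, 0],
    [1, 0.5, 0, 1],
    [1, 1, 0.5, 1],
    [1, 0, 0, 0.5],
]
S = scale_values(Y)
print([round(s, 2) for s in S])
```

Because every pair contributes one point in total, the scale values obtained this way sum to zero; only differences between them carry meaning.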

Next, let us consider an error in a single observation. The poorness of fit for the
model is defined by

$$X = \frac{\sum_{(i,j)} |S_i - S_j|_{\mathrm{poor}}}{\sum_{(i,j)} |S_i - S_j|}, \qquad 0 < X < 1, \tag{9.7}$$

where

$$|S_i - S_j|_{\mathrm{poor}} = \begin{cases} S_i - S_j > 0 & \text{if } Y_i = 0, \\ 0 & \text{if } Y_i = 1. \end{cases} \tag{9.8}$$

Thus, in spite of $j$ being preferred over $i$ ($Y_i = 0$), it is possible that $S_j - S_i < 0$,
and the amount $|S_i - S_j|_{\mathrm{poor}}$ is then added, as in Equation (9.7). When $i$ is preferred
over $j$ ($Y_i = 1$), it is natural that $S_i - S_j > 0$, and the amount is not added to
the numerator. The value of $X$ corresponds to the average error of the scale value.
This should be small enough, say, less than 10%.

Another observation is that, when the number of poor judgments is $K$ according to the
condition expressed by (9.8), the percentage of violations $d$ is defined by

$$d = \frac{2K}{F(F - 1)} \times 100. \tag{9.9}$$

9.1.2. Examples of Individual Preference 

Table 9.1 indicates typical examples of preference judgments with a single subject.
The number of simulated sound fields is $F = 12$, with variations of both the
listening level and the IACC. The value $T_i$ is the aggregated preference score of
each sound field. For the scale values listed in Table 9.1, the number of violations
is $K = 6$; thus, $d = 9.1\%$, and $X = 0.04$ (Singh and Ando, unpublished).
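
As a check on the procedure, the tabulated scores can be run through Equations (9.5) to (9.9). The sketch below is an illustration under stated assumptions: the matrix is transcribed from Table 9.1, each pair is counted once in the sums of Equation (9.7), and $K = 6$ is taken as quoted in the text rather than recounted from the matrix. The helper name `violation_percentage` is hypothetical.

```python
import math

# Preference scores Y[i][j] for subject OS, transcribed from Table 9.1
# (1: field i preferred over field j, 0: the reverse, 0.5 on the diagonal).
Y = [
    [0.5, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 0.5, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [1, 1, 0.5, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    [1, 1, 0, 0.5, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0.5, 0, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 0.5, 1, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 0, 0.5, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 0.5, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 1, 0.5, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0.5, 0],
    [1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0.5],
]
F = len(Y)

T = [sum(row) for row in Y]                                # aggregated scores T_i
S = [math.sqrt(2 * math.pi) * (t / F - 0.5) for t in T]    # Eqs. (9.6) and (9.5)

# Poorness of fit, Eq. (9.7): scale differences that contradict a judgment,
# divided by the sum of all pairwise differences (each pair counted once).
poor = 0.0
total = 0.0
for i in range(F):
    for j in range(i + 1, F):
        diff = S[i] - S[j]
        total += abs(diff)
        if (Y[i][j] == 0 and diff > 0) or (Y[i][j] == 1 and diff < 0):
            poor += abs(diff)
X = poor / total

def violation_percentage(K, F):
    """Percentage of violations, Eq. (9.9)."""
    return 2 * K / (F * (F - 1)) * 100

print(T)
print([round(s, 2) for s in S])
print(round(X, 2), round(violation_percentage(6, F), 1))
```

The computed scale values agree with the $S_i$ column of Table 9.1 to two decimals, and the poorness of fit rounds to the quoted 0.04.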

The results of scale values obtained as a function of the listening level and as 
a parameter of the IACC for a single subject with Music Motif A are shown in 
Figure 9.1. Almost parallel curves of values of the IACC are observed. This reveals 
that both the listening level and the IACC independently influence the subjective 
preference judgments. Hence, the scale values of preference may be described by 
each of the two factors, similar to the global preference with a number of subjects 
(Chapter 4). The most preferred listening level is always found to be close to 77 dBA 
for any value of the IACC. No interactive behavior may be found from the parallel 
curves due to change in the IACC, and similar curves relative to the listening level, 
in spite of the same right-hemispheric dominance (Sections 5.2 and 5.3). Smaller 
values of the IACC are always preferred regardless of the listening level. Thus, 
the scale values for the two factors are described, and are superposed in a manner 
similar to those described in Chapter 4. This kind of independence of the two 
factors was verified for all other 15 subjects participating. 

In addition, such an independent nature may be found for the other two factors,
both associated with the left hemisphere (Sections 5.2 and 5.3): the subsequent
reverberation time $T_{sub}$ and the scale of dimension SD of the hall, or $\Delta t_1$
($\approx$ 22 SD, in ms), as demonstrated and summarized in Figure 9.2 for a single subject.
The most-preferred reverberation time is always near 1.0 s for any value of SD, and the
maximum preference is found near SD = 0.2 ($\Delta t_1$ = 4.4 ms; Music Motif
B). Such independent behavior may be confirmed by means of the analysis
of variance. Results are shown in Table 9.2, indicating that contributions of the
factors are substantial enough to describe the total scale value without interference





Table 9.1. Example of scale values, $S_i$, estimated by aggregating the preference scores (0 or 1).
The paired-comparison tests were conducted by changing both LL and IACC with Music Motif A
(subject OS). Columns 1 to 12 give the score of sound field $i$ (row) against sound field $j$ (column).

Sound field  LL [dB]  IACC    1    2    3    4    5    6    7    8    9   10   11   12      T  P(i > j)    S_i
          1       83  0.98  0.5    0    0    0    0    0    1    0    0    0    0    0    1.5      0.13  -0.94
          2       83  0.72    1  0.5    0    0    0    0    0    0    0    1    0    0    2.5      0.21  -0.73
          3       83  0.39    1    1  0.5    1    0    0    0    0    0    1    0    0    4.5      0.38  -0.31
          4       80  0.98    1    1    0  0.5    0    0    1    0    0    1    0    0    4.5      0.38  -0.31
          5       80  0.72    1    1    1    1  0.5    0    1    0    0    1    1    1    8.5      0.71   0.52
          6       80  0.39    1    1    1    1    1  0.5    1    0    1    1    1    1   10.5      0.88   0.94
          7       77  0.98    0    1    1    0    0    0  0.5    1    0    1    1    0    5.5      0.49  -0.10
          8       77  0.72    1    1    1    1    1    1    0  0.5    0    1    1    1    9.5      0.79   0.73
          9       77  0.39    1    1    1    1    1    0    1    1  0.5    1    1    1   10.5      0.88   0.94
         10       74  0.98    1    0    0    0    0    0    0    0    0  0.5    0    0    1.5      0.13  -0.94
         11       74  0.72    1    1    1    1    0    0    0    0    0    1  0.5    0    5.5      0.49  -0.10
         12       74  0.39    1    1    1    1    0    0    1    0    0    1    1  0.5    7.5      0.63   0.31
Figure 9.1. Scale values of preference for each sound field obtained by the
paired-comparison test as a function of the LL and as a parameter of the IACC (subject OS,
Music Motif A).


effects and errors. Under these conditions, the scale values of preference can be
obtained by summation of the average values of each factor, because the scale
value is linear. For all subjects tested, contributions of the factors are shown in
Table 9.3. The amount of the contribution of each factor differs by subject, but the
total contribution is great enough to describe the scale value of preference, except
for a few cases.
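
The contribution figures can be reproduced from the sums of squares in Table 9.2. The sketch below assumes, as the tabulated numbers suggest, that the contribution of a factor is its sum of squares divided by the total sum of squares (factors plus residual), and that the F value is the ratio of the factor mean square to the residual mean square. The function name `anova_summary` is hypothetical.

```python
def anova_summary(factors, residual):
    """Contribution (%) and F-ratio for each factor.

    factors:  {name: (sum_of_squares, degrees_of_freedom)}
    residual: (sum_of_squares, degrees_of_freedom)
    """
    ss_res, df_res = residual
    ss_total = ss_res + sum(ss for ss, _ in factors.values())
    summary = {}
    for name, (ss, df) in factors.items():
        contribution = 100 * ss / ss_total        # share of the total variation
        f_ratio = (ss / df) / (ss_res / df_res)   # mean square over residual mean square
        summary[name] = (round(contribution, 1), round(f_ratio, 2))
    return summary

# Sums of squares and degrees of freedom from Table 9.2 (subject OS, Motif A).
temporal = anova_summary({"dt1 (SD)": (2.54, 3), "Tsub": (8.09, 3)}, (1.24, 9))
spatial = anova_summary({"LL": (5.01, 3), "IACC": (4.33, 2)}, (0.24, 6))
print(temporal)
print(spatial)
```

With these inputs the contributions match the last column of Table 9.2; the F-ratios come out close to, but not exactly at, the tabulated values, presumably because the printed sums of squares are rounded.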


9.1.3. Individual Preference Description 

Similar to the method described in Chapter 4 for a number of subjects, the scale
value of subjective preference for each individual listener is also approximately
expressed by (Ando and Singh, 1994)

$$S_i \approx -\alpha_i |x_i|^{3/2}, \qquad i = 1, 2, 3, 4. \tag{9.10}$$

Therefore, the individual preference may be characterized by the coefficients
$\alpha_i$, $i = 1, 2, 3, 4$, along with the positive and negative values of every $x_i$, and
the most-preferred values $[LL]_p$, $[\Delta t_1]_p$, and $[T_{sub}]_p$. For convenience, the values
of $\alpha_i$ for the positive and negative ranges are averaged to obtain a single value.

The results of the preferred values of $[LL]_p$ and $\alpha_i$ ($i = 1, 4$) are listed in
Table 9.4, and the values of $[SD]_p$ and $[T_{sub}]_p$ are found in Table 9.5. Scale values
of preference as a function of LL are shown in Figure 9.3 for individual subjects.
Figure 9.2. Scale values of preference for each sound field obtained by the
paired-comparison test as a function of $T_{sub}$ [s] and as a parameter of the SD (subject OS,
Music Motif B).


Obviously, the preferred listening levels differ greatly among the subjects, in some
cases falling outside the range tested, 74 dBA to 83 dBA. For example, subjects OK, AK,
and YU preferred levels below 74 dBA, and subjects MA and HY preferred levels above 83 dBA.

The most remarkable results are shown in Figure 9.4: all individuals preferred a low
value of the IACC, without exception and regardless of which of the two Music Motifs,
A and B, was used in the tests.

As shown in Figure 9.5(a) with Music Motif A, three subjects preferred a delay
around $[\Delta t_1]_p$ = 66 ms ($[SD]_p$ = 3), but others preferred less than 22 ms
($[SD]_p$ = 1) and some more than $\Delta t_1$ = 134 ms. With Music Motif B, as shown in
Figure 9.5(b), the most-preferred values are around $[\Delta t_1]_p$ = 18 ms ($[SD]_p$ = 0.8)
or less than 4.4 ms ($[SD]_p$ = 0.2).

As shown in Figure 9.6, great individual differences are also observed for the
subsequent reverberation time. With Music Motif A, for example, several subjects
preferred about 3.0 s, but others preferred more than 6.0 s or less than 1.5 s. With
Table 9.2. Examples of the analysis of variance (subject OS).

Music     Factor              Degrees of   Sum of     F-test   p        Contribution
                              freedom      squares                      (%)
Motif A   $\Delta t_1$ (SD)   3            2.54       6.16     < 0.02   21.4
          $T_{sub}$           3            8.09       19.64    < 0.01   68.2
          Residual            9            1.24
          LL                  3            5.01       41.47    < 0.01   52.3
          IACC                2            4.33       53.71    < 0.01   45.2
          Residual            6            0.24
Motif B   $\Delta t_1$ (SD)   3            3.32       22.26    < 0.01   28.6
          $T_{sub}$           3            7.83       52.51    < 0.01   67.5
          Residual            9            0.45
          LL                  3            5.01       41.47    < 0.01   52.3
          IACC                2            4.33       53.71    < 0.01   45.2
          Residual            6            1.66

Music Motif B, however, all the participating subjects preferred values near 1.0 s
or less.

As described in the theory of subjective preference in Section 4.4, the most-preferred
conditions for the average of listeners (16 subjects) are $[\Delta t_1]_p$ = 55 ms
($A$ = 3.7) and $[T_{sub}]_p$ = 2.9 s for Music Motif A, and $[\Delta t_1]_p$ = 14.5 ms ($A$ = 4.6)
and $[T_{sub}]_p$ = 0.99 s for Music Motif B. The preferred initial time-delay gaps
also differ greatly among the individuals. But these most-preferred conditions are
related well to the effective duration of the ACF of the source signal ($\tau_e$ = 127 ms
for Music Motif A; and $\tau_e$ = 43 ms for Music Motif B) for each individual.
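
The relation to $\tau_e$ can be checked numerically. Equation (4.5) is quoted in Section 9.4.2 as $[\Delta t_1]_p \approx (1 - \log_{10} A)\tau_e$; for the reverberation time, the sketch below assumes that Equation (4.7) has the form $[T_{sub}]_p \approx 23\tau_e$, which reproduces the values quoted here but is not stated in this chapter. Function names are hypothetical.

```python
import math

def preferred_delay_ms(tau_e_ms, amplitude):
    """Most-preferred initial delay, (1 - log10 A) * tau_e, as in Equation (4.5)."""
    return (1 - math.log10(amplitude)) * tau_e_ms

def preferred_reverberation_s(tau_e_ms, factor=23.0):
    """Most-preferred reverberation time; factor=23 is an assumed form of Equation (4.7)."""
    return factor * tau_e_ms / 1000.0

# (motif, tau_e in ms, total amplitude A) as quoted in the text.
for motif, tau_e, A in [("A", 127.0, 3.7), ("B", 43.0, 4.6)]:
    print(motif,
          round(preferred_delay_ms(tau_e, A), 1),
          round(preferred_reverberation_s(tau_e), 2))
```

Both motifs come out within rounding of the averaged preferred values given above, which is the sense in which the preferred conditions follow the effective duration of the source signal.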

One of the reasons why such great individual differences appear in the temporal
factors and do not appear for the IACC is that the temporal factors have a greater
influence than the spatial factor on the individual brain during the period after
birth in which the personality is formed.


9.2. Effects of Lighting on Individual Subjective 
Preference 

In order to gain knowledge of the interaction between lighting and an individual’s
subjective preference for the sound field, the lighting was controlled along with
the listening level (LL) and the time delay of the single reflection ($\Delta t_1$) (Ando,
Watanabe, and Yamamoto, 1990).



Table 9.3. Results of analyses of variance for all of the subjects tested.

                   Contribution (%)
Subject  Music     $\Delta t_1$ (SD)  $T_{sub}$  Total    LL       IACC     Total
OS       Motif A   21.4*              68.2**     89.6     52.3**   45.2**   97.5
         Motif B   28.6**             67.5**     96.1     52.3**   45.2**   97.5
MS       Motif A   57.7               14.2       71.9     -        -
         Motif B   2.6                92.2**     94.8     49.0     24.4     73.4
HA       Motif A   9.8                63.7**     73.5     20.8*    71.2**   92.0
         Motif B   44.5**             51.5**     96.0     23.3**   72.2**   95.5
KO       Motif A   16.0               69.5**     85.5     25.1     52.2**   77.3
         Motif B   4.7*               91.7**     96.4     64.7**   21.3*    86.0
SK       Motif A   63.5**             12.3       75.8     53.6**   39.8**   93.4
         Motif B   4.0**              94.4**     98.4     29.7**   66.8**   96.5
MA       Motif A   4.6                79.5**     84.1     67.9**   26.8**   94.7
         Motif B   1.7                81.6**     83.3     36.4     29.2     65.6
YA       Motif A   -                  -                   19.5*    72.5**   92.0
         Motif B   5.2                84.3**     89.5     61.4**   34.0**   95.4
AK       Motif A   -                  -                   60.0**   32.1**   92.1
         Motif B   14.2*              79.0**     93.2     55.8**   33.4*    89.2
TA       Motif A   -                  -                   7.9      83.9**   91.8
         Motif B   4.1                88.8**     92.9     19.4     71.4**   90.8
TN       Motif A   58.9**             23.0*      81.9     47.0**   44.9**   91.9
         Motif B   16.8*              70.5**     87.3     9.8      64.9*    74.7
FU       Motif A   7.8                78.9**     86.7     42.1**   52.1**   94.2
         Motif B   24.6**             64.5**     89.1     58.8**   22.8     81.6
MI       Motif A   5.6                87.2**     92.8     45.8*    39.3*    85.1
         Motif B   4.2                92.3**     96.5     34.0*    55.5**   89.5
HY       Motif A   -                  -                   82.6**   10.7     93.3
         Motif B   19.5               60.8       80.3     86.5**   6.7      93.2
OK       Motif A   -                  -                   65.2**   25.2*    90.4
         Motif B   -                  -                   85.0**   8.8      93.8
MO       Motif A   -                  -                   29.2*    60.5**   89.7
         Motif B   -                  -                   8.3      60.4*    68.7
YU       Motif A   -                  -                   36.7*    54.7**   91.4
         Motif B   -                  -                   44.0*    43.0*    87.0

* p < 0.05
** p < 0.01
Table 9.4. Preferred values of the LL and the weighting
coefficients $\alpha_i$ ($i$ = 1 and 4).

          Motif A                            Motif B
Subject   $[LL]_p$  $\alpha_1$  $\alpha_4$   $[LL]_p$  $\alpha_1$  $\alpha_4$
OS        78.0      0.15        1.12         77.3      0.06        1.26
MS        78.1      0.11        1.03         78.5      0.11        1.69
HA        76.5      0.07        1.31         80.1      0.07        1.70
KO        75.3      0.04        1.12         80.7      0.11        0.96
SK        78.3      0.13        1.22         79.8      0.10        1.52
MA        > 83.0    -           0.56         > 83.0    -           1.62
YA        80.0      0.09        1.17         > 83.0    -           0.99
AK        < 74.0    -           1.22         78.0      0.13        1.21
TA        78.5      0.06        1.40         76.9      0.08        1.76
TN        79.6      0.12        1.01         78.2      0.05        1.44
FU        77.2      0.08        1.22         80.4      0.08        0.61
MI        78.5      0.08        0.95         78.5      0.09        1.57
HY        > 83.0    -           0.29         > 83.0    -           0.47
OK        < 74.0    -           1.06         79.9      0.15        1.02
MO        76.8      0.07        1.21         77.0      0.06        1.39
YU        < 74.0    -           1.07         77.0      0.10        1.05


Table 9.5. Preferred values of the factors SD ($\Delta t_1$) and $T_{sub}$ and the weighting
coefficients $\alpha_i$ ($i$ = 2 and 3).

          Motif A                                             Motif B
Subject   $[SD]_p$  $[T_{sub}]_p$ [s]  $\alpha_2$  $\alpha_3$   $[SD]_p$  $[T_{sub}]_p$ [s]  $\alpha_2$  $\alpha_3$
OS        < 1.0     < 1.50             -           -            0.74      0.82               3.38        11.46
MS        -         -                  -           -            1.05      < 0.50             1.44        -
HA        2.56      3.43               2.46        5.03         < 0.20    0.90               -           7.45
KO        > 7.0     6.00               -           -            < 0.20    < 0.50             -           -
SK        < 1.0     3.42               -           6.98         < 0.20    < 0.50             -           -
MA        > 7.0     > 6.00             -           -            0.80      < 0.50             1.61        -
YA        -         -                  -           -            0.88      0.82               2.20        7.35
AK        -         -                  -           -            0.76      0.97               2.32        8.66
TA        -         -                  -           -            < 0.20    0.91               -           8.88
TN        2.45      2.58               2.18        5.28         < 0.20    < 0.50             -           -
FU        3.72      < 1.50             4.12        -            0.76      1.12               1.91        4.22
MI        < 1.0     < 1.50             -           -            < 0.20    < 0.50             -           -
HY        -         -                  -           -            < 0.20    0.83               -           3.30
TY        < 1.0     < 1.50             -           -            < 0.20    0.90               -           4.18
Figure 9.4. Individual differences of scale values of preference as a function of the IACC.
(a) Music Motif A; and (b) Music Motif B.


9.2.1. Effects on the Preferred Listening Level 

As shown in Figure 9.7, a cotton screen transparent to both light and sound was
used, resulting in a uniformly illuminated environment. The intensity of the lighting
was varied over three levels, 2 lx, 35 lx, and 600 lx (brightness: 1.0 ± 0.2, 13.1 ± 1.5,
and 183.7 ± 5.5, respectively). At the same time, the LL was controlled at five
levels, -8 dB, -4 dB, 0 dB, +4 dB, and +8 dB, in reference to the preferred LL
Figure 9.5. Individual differences of scale values of preference as a function of $\Delta t_1$
($\approx$ 22 SD). (a) Music Motif A; and (b) Music Motif B.

Figure 9.7. Diagram of controlling the lighting level and physical factors of the sound
field.


which was obtained by a method of adjustment as a preliminary test at 2 lx for each
subject. This procedure is reasonable because of the great individual differences
in the preferred LL, as mentioned in the previous section. The sound source was
a speech signal, the reading of part of a poem (0.9 s). The subjects were seven
students (six males and one female). Each subject was asked to judge 16 times on
up to 56 pairs extracted from the 105 pairs in total [$n(n - 1)/2$, $n$ = 15].

The results of the scale value of the preference for the sound fields of each
listener are shown in Figure 9.8. The significance levels for the contributions of
lighting to the preferred LL, obtained by the analysis of variance, are indicated
in Table 9.6. The contributions of the lighting are 15% or less; however, four
subjects, A, B, D, and G, show significant results ($p < 0.05$). The most-preferred
listening levels of subjects A, B, D, and E are somewhat shifted toward weaker
sound-pressure levels at the high illumination of 600 lx. The preferred illumination
during listening to the speech differed for each listener but, overall, about 35 lx
appeared to be acceptable.

9.2.2. Effects on the Preferred Initial Time Delay 

The subjects participating were eight male students (different from those in the
above examination). The time delay of the single reflection was varied: $\Delta t_1$ =
Table 9.6. Effects of lighting on the preferred LL.

Subject   Contribution    Contribution of    Preferred
          of LL [%]       lighting [%]       illuminance [lx]
A         85**            9*                 2, 35
B         78**            13*                35
C         96**            0                  600
D         86**            8*                 -
E         <27**           0                  -
F         90**            4                  -
G         84**            15**               35

* p < 0.05.
** p < 0.01.



Figure 9.8. Scale values of preference of each subject as a function of the relative sound
level (RL) [dB].
0 ms, 11 ms, 22 ms, 44 ms, and 88 ms ($A_0 = A_1 = 1$; $\tau_e$ = 12 ms). Other
experimental conditions were the same as those of the experiments above. The
results of the scale value of preference are shown in Figure 9.9, and the
contributions of the lighting to the preferred $\Delta t_1$ are indicated in Table 9.7.
Despite large individual differences in preference judgments for the sound
field, as mentioned above, almost all the subjects (except for subject O)
indicated that the preferred values of $\Delta t_1$ are highly independent of the lighting
levels.

Considering the fact that the change of lighting involves right cerebral
hemispheric dominance (Davis and Wada, 1974), no interference effects may
be observed with a change in $\Delta t_1$, which involves left-hemispheric
dominance, as discussed in Section 5.2.2. In contrast, much stronger interference
effects are observed between the listening level and the lighting, because both
involve right-hemispheric dominance (see Section 5.2.2).


9.3. Inter-Individual Differences in Preference Judgments 

As mentioned in Section 9.1.3, the individual difference is identified by the
most-preferred value of each physical factor and the weighting coefficient $\alpha_i$ ($i$ = 1,
2, 3, 4). In this section, the inter-individual difference is discussed with respect to
the listening level ($i$ = 1), such that

$$S_1 \approx -\alpha_1 |x_1|^{3/2}, \tag{9.11}$$

where $x_1 = LL - [LL]_p$, and $\alpha_1$ is defined by a single value for the positive and negative
ranges of $x_1$. Here, fluctuations in terms of $[LL]_p$ and $\alpha_1$ are examined for each test
series for each subject (Sakai, Singh, and Ando, 1997). In this investigation,
Music Motif B was used as a source signal, and ten test series were conducted for
each subject. Thirteen male students (21-24 years of age) participated as subjects.
They had no previous experience in preference judgments. Five sound-pressure
levels, 66, 72, 78, 84, and 90 dBA, were chosen to cover the preference range of the
subjects participating.

In order to obtain values of $[LL]_p$ and $\alpha_1$, a preference curve is drawn from the
scale values obtained by the use of Equation (9.11), as shown in Figure 9.10. In
this example, $[LL]_p$ = 75.0 dBA and $\alpha_1$ = 0.028. The value of $\alpha_1$ is
obtained by the quasi-Newton method. All of the data were arranged in this way,
and the resulting values of $[LL]_p$ and $\alpha_1$ are shown in Figures 9.11 and 9.12,
respectively, for each subject. Inter-individual differences of subjects C, E, F,
J, and K were quite large, but not in the other cases. In order to discuss the
reasons why such a large inter-individual difference was observed, scale values
obtained in the ten test series for two extreme subjects K and G are plotted in
Figure 9.13(a, b), in which values of $[LL]_p$ are shifted to 0 dB without any loss of
information.
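
A curve of the form of Equation (9.11) can be fitted in several ways. The sketch below does not use the quasi-Newton method mentioned above; it substitutes a simple grid search over candidate values of $[LL]_p$, with the least-squares $\alpha_1$ in closed form for each candidate. The function name and the synthetic scale values are hypothetical; the data are generated from the model itself with the example values $[LL]_p$ = 75.0 dBA and $\alpha_1$ = 0.028, not measured.

```python
def fit_preference_curve(levels, scale_values, candidates):
    """Least-squares fit of S ~ -alpha * |LL - LL_p|**1.5 (Equation (9.11)).

    For each candidate LL_p the best alpha has a closed form; the candidate
    with the smallest squared error is returned as (LL_p, alpha).
    """
    best = None
    for ll_p in candidates:
        u = [abs(ll - ll_p) ** 1.5 for ll in levels]
        denom = sum(v * v for v in u)
        if denom == 0:
            continue
        alpha = -sum(s * v for s, v in zip(scale_values, u)) / denom
        error = sum((s + alpha * v) ** 2 for s, v in zip(scale_values, u))
        if best is None or error < best[0]:
            best = (error, ll_p, alpha)
    return best[1], best[2]

levels = [66, 72, 78, 84, 90]                      # dBA, the five levels used in the tests
# Synthetic scale values generated from the model itself (not measured data).
synthetic = [-0.028 * abs(ll - 75.0) ** 1.5 for ll in levels]

candidates = [66 + 0.1 * k for k in range(241)]    # 66.0 to 90.0 dBA in 0.1 dB steps
ll_p, alpha = fit_preference_curve(levels, synthetic, candidates)
print(round(ll_p, 1), round(alpha, 3))
```

With measured scale values the residual error will not vanish, and a flat curve (small $\alpha_1$) leaves the minimum poorly defined, which is the situation described for subject K below.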

Obviously, if the value of $\alpha_1$ is small, as it is for subject K, then the curves
of scale values are rather flatter than those for subject G, and the most-preferred
Table 9.7. Effects of lighting on the preferred $\Delta t_1$.

Subject   Contribution             Contribution of     Preferred
          of $\Delta t_1$ [%]      the lighting [%]    illuminance [lx]
H         97**                     0                   -
I         <27**                    0                   -
J         92**                     0                   -
K         96**                     0                   -
L         71**                     10                  (2, 35)
M         63**                     1                   -
N         97**                     0                   -
O         94**                     3*                  35

* p < 0.05.
** p < 0.01.
( ) Only for the range of $\Delta t_1$ > 11 ms.



Figure 9.9. Scale values of preference of each subject as a function of the delay time
$\Delta t_1$ [ms]. Symbols are the same as those indicated in the upper right part of Figure 9.8.
Figure 9.10. An example of the regression curve for scale values of subjective preference
(subject G, Test series 2). $[LL]_p$ is found at 75.0 dBA.


listening levels are barely determined. On the other hand, for subject G, the value
of $\alpha_1$ is quite large and critical to determining the preferred level. Thus the
inter-individual differences did not fluctuate depending on the number of test series. For
all such critical subjects, the values of $\alpha_1$ were greater than 0.02. The relationship
between the range of fluctuation of the preferred listening level and the value of
$\alpha_1$ is demonstrated in Figure 9.14.



Figure 9.11. Results of inter-individual differences of $[LL]_p$ for each subject. Arrows
indicate values of $[LL]_p$ that were out of the range examined, 66 dBA to 90 dBA.

Figure 9.12. Values of the weighting coefficient $\alpha_1$ obtained from ten test series for each
subject.


9.4. Seat Selection System for Individual Listening 

9.4.1. Seat Selection System Enhancing 
Individual Preference 

In order to maximize the individual subjective preference for each listener, a special
facility, a seat selection system for testing his or her own subjective preference, was
first put into use at the Kirishima International Concert Hall in 1994 (Sakurai,
Korenaga, and Ando, 1997). The sound simulation is based on the system described
in Section 3.5, with multiple loudspeakers as simplified in Figure 3.8. The system
allows four listeners to test their subjective preference for the sound field
at the same time. Since the four factors of the sound field influence the preference
judgments almost independently, as was discussed in Section 9.1, each single factor
is varied while the other three are fixed at nearly the most-preferred conditions for
a number of listeners. Results of the testing by acousticians who participated in the
First International Symposium on Music and Concert Hall Acoustics (MCHA95),
which was held in Kirishima in May 1995, are presented here.


9.4.2. Test Results of Individual Preference 

The music source was orchestral, the “Water Music” by Handel; the effective duration
of the autocorrelation function, τ_e, was 62 ms. The total number of listeners
participating was 106. Typical examples of the test results as a function of each






Figure 9.13. Examples of the scale values of preference obtained from ten test series.
Different symbols correspond to results from the ten test series. (a) Subject G with a large
value of α_1 (= 2.6 × 10^-2); and (b) subject K with a small value of α_1 (= 1.7 × 10^-2).


factor for the listener BL are shown in Figure 9.15. Scale values of the listener were
close to the averages previously obtained: the most-preferred [LL]_p is 83 dBA,
[Δt_1]_p is 26.8 ms (the preferred value calculated by Equation (4.5) was 24.8 ms,
where [Δt_1]_p ≈ (1 − log10 A)τ_e, A = 4.0), and the most-preferred reverberation
time was 2.05 s (the preferred value calculated by Equation (4.7) is 1.43 s). Thus, the
area of seats near the center was preferred for listener BL, as shown in Figure 9.16.
Examples of the preferred value of each factor and the weighting coefficients for
five listeners are listed in Table 9.8. With regard to the IACC, which is not listed
in this table, the scale value of preference increased with decreasing IACC value
for all listeners. Since listener KH preferred a very
short delay time of the initial reflection, his preferred seats were located close to
the boundary walls, as shown in Figure 9.17. Listener KK indicated a preferred
listening level exceeding 90 dBA (Table 9.8). For this listener, the frontal seating
area close to the stage was preferable, as shown in Figure 9.18. Listener DP
preferred a rather weak listening level of 76.0 dBA and a short initial delay time
of 15.0 ms, so that the preferred seats were in the rear part of
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Table 9.8. Examples of preferred value of each factor and the weighting 
coefficient represented by the individual difference in subjective
preference.

Subject   Preferred   Preferred   Preferred   α_1       α_2    α_3    α_4
          LL [dBA]    Δt_1 [ms]   T_sub [s]   [10^-2]
BL        83.0        26.8        2.05        6.0       1.86   1.46   1.96
KH        83.0        6.0         1.29        6.0       0.74   1.48   2.49
KK        > 90.0      21.0        1.84        1.0       1.39   0.83   2.02
DP        76.0        15.0        1.77        7.0       1.34   1.77   2.49
CA        83.0        > 100.0     1.27        1.0       0.30   1.45   2.84



Figure 9.14. Relationship between the range of fluctuation of [LL]_p and the weighting
coefficient α_1.


the hall, as shown in Figure 9.19. The preferred initial time-delay gap for listener
CA exceeds 100 ms, but is not critical, as indicated by the value of α_2 = 0.30.
Thus, any initial delay time is acceptable, but the IACC is critical. Therefore, the
preferred area of seats was located only in the center, as is shown in Figure 9.20.
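
The way individual preferences translate into preferred seats can be sketched numerically. The following is a minimal sketch, not the implementation of the seat selection system: it assumes the scale-value form S_i = −α_i |x_i|^(3/2) of the preference theory (Equation (9.10) is not reproduced in this section), with x_1 the listening level relative to [LL]_p in dB, x_2 = log10(Δt_1/[Δt_1]_p), x_3 = log10(T_sub/[T_sub]_p), and x_4 = IACC. The listener values are those of BL and DP in Table 9.8; the three seats, their acoustic factors, and all function names are hypothetical.

```python
import math

# Preferred values and weighting coefficients taken from Table 9.8.
LISTENERS = {
    "BL": {"LL_p": 83.0, "dt1_p": 26.8, "Tsub_p": 2.05,
           "alpha": (0.06, 1.86, 1.46, 1.96)},
    "DP": {"LL_p": 76.0, "dt1_p": 15.0, "Tsub_p": 1.77,
           "alpha": (0.07, 1.34, 1.77, 2.49)},
}

# Hypothetical seats (illustration only, not data from the hall).
SEATS = {
    "front":  {"LL": 88.0, "dt1": 30.0, "Tsub": 1.8, "IACC": 0.40},
    "center": {"LL": 83.0, "dt1": 27.0, "Tsub": 1.8, "IACC": 0.30},
    "rear":   {"LL": 76.0, "dt1": 15.0, "Tsub": 1.8, "IACC": 0.35},
}


def total_preference(listener, seat):
    """Sum of the four scale values, assuming S_i = -alpha_i * |x_i|**1.5."""
    x = (
        seat["LL"] - listener["LL_p"],                   # dB re preferred level
        math.log10(seat["dt1"] / listener["dt1_p"]),     # normalized delay
        math.log10(seat["Tsub"] / listener["Tsub_p"]),   # normalized reverberation
        seat["IACC"],                                    # smaller is preferred
    )
    return sum(-a * abs(xi) ** 1.5 for a, xi in zip(listener["alpha"], x))


def best_seat(listener, seats):
    """Name of the seat with the highest total scale value."""
    return max(seats, key=lambda name: total_preference(listener, seats[name]))


for name, listener in LISTENERS.items():
    print(name, best_seat(listener, SEATS))
```

With these made-up seats the sketch places BL near the center and DP at the rear, in line with the tendencies shown in Figures 9.16 and 9.19.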


9.4.3. Most-Preferred Conditions for Individuals 

Cumulative frequencies of the preferred values for the 106 listeners are shown in
Figures 9.21 to 9.23 for three factors. As indicated in Figure 9.21, about 60% of
listeners preferred the range 80 dBA to 84.9 dBA in listening to music, but some
listeners indicated that the most-preferred LL was above 90 dBA, and the preferred
LL was scattered over a total range exceeding 20 dBA. As shown in
Figure 9.22, about 45% of listeners preferred initial delay times of 20 ms to 39 ms,
which were around the calculated preference of 24.8 ms (Equation 4.5); some
listeners indicated 0 ms to 9 ms and others more than 80 ms. With regard to the
reverberation time, as shown in Figure 9.23, about 45% of listeners preferred 1.0 s
to 1.9 s, values centered on the calculated preferred value of 1.43 s, but
some listeners indicated preferences of less than 0.9 s or more than 4.0 s.

Figure 9.15. Scale values of preference obtained by tests for the four factors of the subject
BL. (a) The most-preferred listening level is 83 dBA; the individual weighting coefficient
in Equation (9.10) is α_1 = 0.06. (b) The preferred initial time-delay gap between the direct
sound and the first reflection is 26.8 ms; the individual weighting coefficient in Equation
(9.10) is α_2 = 1.86, where [Δt_1]_p calculated by Equation (4.5) with τ_e = 62 ms for the music
used (A = 4.0) is 24.8 ms (abscissa: log Δt_1/[Δt_1]_p). (c) The preferred subsequent
reverberation time is 2.05 s; the individual weighting coefficient in Equation (9.10) is
α_3 = 1.46, where [T_sub]_p, calculated by Equation (4.11) with τ_e = 62 ms for the music
used, is 1.43 s. (d) Individual weighting coefficient in Equation (9.10): α_4 = 1.96
(abscissa: IACC).

Figure 9.16. Preferred seat area calculated for subject BL. The seats are classified in three
parts according to the scale values of preference calculated by the summation S_1 through
S_4. Black seats indicate the preferred area, about one-third of all seats in this concert
hall, for subject BL.

Figure 9.17. Preferred seat area calculated for subject KH.

Figure 9.18. Preferred seat area calculated for subject KK.

Figure 9.19. Preferred seat area calculated for subject DP.

Figure 9.20. Preferred seat area calculated for subject CA.

Both the initial delay time and the subsequent reverberation time appear to be
related to a kind of “liveness.” Thus, it might be assumed that there is a great interference
effect between these factors for each individual. However, as shown in Figure 9.24,
there is little correlation between the values of [Δt_1]_p and [T_sub]_p (correlation is
0.06). The same is true for the correlation between the values of [T_sub]_p and [LL]_p,
and for the correlation between the values of [LL]_p and [Δt_1]_p, each a correlation of less
than 0.11. Figure 9.25 shows a three-dimensional plot of the preferred values
of [LL]_p, [Δt_1]_p, and [T_sub]_p. The preferred values are distributed continuously,
and no groupings of individuals can be seen to emerge from the data.

Cumulative frequencies of the values of α_i for the four factors, i = 1, 2, 3, 4, are
shown in Figures 9.26 to 9.29. These values signify weights for the four factors
of each listener. For instance, when the value of α_1 is smaller than 0.02, then the
listening level is insignificant for that listener, as discussed in the previous section.

Examples of the correlation between the values of α_1 and α_4 (both spatial factors)
and the correlation between the values of α_2 and α_3 (both temporal factors) are
demonstrated in Figure 9.30, in which the values of correlation are less than 0.04.
Since there are no correlations between the values of α_i and α_j (i ≠ j), a listener
indicating a relatively small value of α_i for one factor will not always indicate
a relatively small value for another factor. Thus, a listener is critical about preferred
conditions as a function of certain factors, while insensitive to other factors,
resulting in a characteristic individual difference distinct from other listeners.




Figure 9.22. Cumulative frequency of the preferred initial time-delay gap between the
direct sound and the first reflection [Δt_1]_p (106 subjects). About 45% of subjects preferred
the range of 20 ms to 39 ms. The calculated value of [Δt_1]_p by Equation (4.9) is 24.8 ms.

Figure 9.23. Cumulative frequency of the preferred subsequent reverberation time [T_sub]_p
(106 subjects). About 45% of subjects preferred the range of 1.0 s to 1.9 s. The calculated value
of [T_sub]_p by Equation (4.10) is 1.43 s.

Figure 9.24. Relationship between preferred values of [Δt_1]_p and [T_sub]_p for each subject.
No significant correlation between the values was found.

Figure 9.25. Three-dimensional illustration of the preferred physical factors of the sound
field for each subject. Preferred conditions are distributed in a certain range of each factor
so that subjects could not be classified into any group.

Figure 9.26. Cumulative frequency of the weighting coefficient α_1 of 106 subjects.

Figure 9.27. Cumulative frequency of the weighting coefficient α_2 of 106 subjects.

Figure 9.28. Cumulative frequency of the weighting coefficient α_3 of 106 subjects.

Figure 9.29. Cumulative frequency of the weighting coefficient α_4 of 106 subjects.
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Case Studies of Acoustic Design 


The purpose of this chapter is to demonstrate, by means of case studies, how the
acoustic design of a concert hall and of a multiple-purpose auditorium is developed. As
an example of outdoor spaces, the physical properties of a forest with multiple
scattering phenomena are discussed in Chapter 11. There, it will be seen how the
effects of scattering by a number of columns in an enclosure, or by a number of
trees in an outdoor space, may be understood as elements of acoustic design.


10.1. Concert Hall Design 

The principle of the superposition of the scale value of subjective preference, 
with the optimal values of the four objective acoustics factors, can be applied to 
determine the total preference value at each seat. Comparison of the total preference 
values for different configurations of a concert hall allows us to choose the best 
design for a specific performance space, such as for a certain music program. 

Procedures for designing sound fields in a concert hall are illustrated in 
Figure 10.1. Temporal factors and spatial factors are carefully designed, in order 
to satisfy both left and right human cerebral hemispheres for each listener, for the 
conductor, and for each musician on the stage. 

(1) Temporal Factors for Listeners 

First of all, the purpose of a concert hall under planning is determined by a classification
of the music to be performed, with respect to a range of τ_e. The planning
is associated with other facilities existing near the site, and with the location of
the concert hall under design. If the space is designed for the performance of a
pipe organ, the temporal factors Δt_1 and T_sub are determined by the range of τ_e,
which may be selected to be, say, centered about 200 ms (T_sub ≈ 4.6 s, Section
4.3). When it is designed for the performance of chamber music, the range of τ_e is
selected near the value of 65 ms (T_sub ≈ 1.5 s). The conductor or the sound coordinator
selects suitable music motifs with a satisfactory range of τ_e of the ACF to
achieve a music performance that blends the music and the sound field in the hall.
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© Springer- Verlag New York, Inc. 1998 





Figure 10.1. Acoustic-design procedure maximizing the scale value of both temporal 
factors (TF) and spatial factors (SF) for the sound field in a concert hall, enhancing the 
satisfaction for both human cerebral hemispheres. The specialization of the left hemisphere 
for temporal factors and the right hemisphere for spatial factors for each listener, conductor, 
and performer are taken into consideration. 


The information for the ACF of music signals and the related subjective attributes
are integrated. Moreover, in order to adjust the preferred initial time-delay gap
for each music performance location, each instrument is carefully
positioned on the stage. For instance, if the value of τ_e for violins is shorter than that
of contrabasses, which play mainly in the low-frequency range, the position of the violins is
shifted closer to the left wall on the stage and the position of the contrabasses is shifted
closer to the center, as viewed from the audience.
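
The design values quoted above follow directly from τ_e. A minimal sketch, assuming the relations used in this book's examples, [T_sub]_p ≈ 23 τ_e and [Δt_1]_p ≈ (1 − log10 A) τ_e, where A is the total amplitude of reflections; the function names are illustrative only.

```python
import math


def preferred_tsub(tau_e_ms):
    """Preferred subsequent reverberation time in seconds (about 23 * tau_e)."""
    return 23.0 * tau_e_ms / 1000.0


def preferred_dt1(tau_e_ms, amplitude=4.0):
    """Preferred initial time-delay gap in ms, (1 - log10 A) * tau_e."""
    if amplitude <= 0.0:
        raise ValueError("amplitude A must be positive")
    return (1.0 - math.log10(amplitude)) * tau_e_ms


# tau_e values mentioned in the text for different program sources.
for label, tau_e in [("pipe organ", 200.0), ("chamber music", 65.0), ("speech", 22.0)]:
    print(f"{label}: tau_e = {tau_e:.0f} ms, preferred T_sub = {preferred_tsub(tau_e):.2f} s")
```

For τ_e = 62 ms and A = 4.0, `preferred_dt1` gives about 24.7 ms, close to the 24.8 ms quoted in Section 9.4.2, and `preferred_tsub` gives 1.43 s.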

(2) Spatial Factors for Listeners 

As discussed in Chapter 8, the IACC should be kept as small as possible, maintaining
τ_IACC = 0. This is realized by suppressing the strong reflection from the
ceiling, and by appropriate reflections from the side walls at particular angles. If
the source signal contains mainly frequency components around 1 kHz, the
reflections from the side walls are adjusted to be centered at roughly 55° to each
listener, measured from the median plane of the listener. Under actual hearing
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conditions in an existing hall, the perceived IACC depends on whether or not 
the amplitudes of reflection exceed the hearing threshold level, in addition to the 
physical value given by Equation (3.25). Thus, a more diffuse sound field may be 
perceived with increasing power of the sound source. For example, Keet (1968)
reported that the apparent source width (ASW) increases with an increase in the
listening level. When the source is weak enough that only the direct sound is heard,
the actual IACC being processed in the auditory-brain system approaches unity,
resulting in no diffuse sound impression. In general, small values of the IACC
have to be realized by early strong reflections only. If the sound source is located 
on the center line on the stage, then coherent signals arrive at the same time from 
both the side walls. From this point of view, acoustical-asymmetric properties in 
shaping the hall may create further advantages. 
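
As a rough illustration of how the IACC and τ_IACC relate to the two ear signals, the following sketch takes the IACC as the maximum magnitude of the normalized interaural cross-correlation within ±1 ms, and τ_IACC as the lag at which that maximum occurs. This commonly used definition is assumed here rather than copied from Equation (3.25); the function name, the sign convention of the lag, and the test signals are illustrative.

```python
import math
import random


def iacc(left, right, fs, max_lag_ms=1.0):
    """Return (IACC, tau_IACC in ms) for two equal-length ear signals."""
    if len(left) != len(right):
        raise ValueError("ear signals must have the same length")
    n = len(left)
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    norm = math.sqrt(sum(v * v for v in left) * sum(v * v for v in right))
    if norm == 0.0:
        raise ValueError("signals must not be silent")
    best_value, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Assumed convention: positive lag means the right-ear signal is later.
        acc = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += left[i] * right[i + lag]
        value = abs(acc) / norm
        if value > best_value:
            best_value, best_lag = value, lag
    return best_value, 1000.0 * best_lag / fs


# Synthetic test signals: a noise burst, and the same burst delayed by 5 samples.
rng = random.Random(0)
burst = [rng.uniform(-1.0, 1.0) for _ in range(200)]
direct = [0.0] * 60 + burst + [0.0] * 60
delayed = [0.0] * 65 + burst + [0.0] * 55

print(iacc(direct, direct, 48000))   # fully coherent, maximum at zero lag
print(iacc(direct, delayed, 48000))  # still coherent, but maximum off zero lag
```

The second case shows why a small IACC alone is not the whole story: a coherent signal arriving with an interaural delay keeps the IACC near unity while moving τ_IACC away from zero, which corresponds to an image shift.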

(3) The Sound Field for Musicians 

For music performers (alto-recorder), the temporal factors are much more critical
than the spatial factors (Section 7.1). Since musicians perform over a sequence
of time, reflections with a suitable delay time relative to the value of τ_e of the
source signals are of particular importance (for cellists; Sato, Ota, and Ando, 1998).
Without any spatial subjective diffuseness, the preferred directions of reflections
are in the median plane of music performers, resulting in IACC ≈ 1.0. In order to
satisfy these acoustic conditions, some design iterations are required, maximizing
the scale values for both musicians and listeners, and leading to the final scheme
of the concert hall (Section 7.2).

(4) The Sound Field for the Conductor 

It is recommended that the sound field for the conductor should be designed as 
that of a “listener” with the appropriate reflections of the side walls on the stage 
(Meyer, 1995). 

(5) Fusing Acoustic Design with Architecture 

From the historical viewpoint, architects have perhaps been more concerned with
spatial criteria from the visual standpoint, and less so with the temporal criteria
for blending human life and the environment under design, while acousticians have
mainly been concerned with the temporal criteria, represented by the reverberation
time, from the time of Sabine (1900 onward). No theory of design has existed
that includes the spatial criterion represented by the IACC, so that discussions between
acousticians and architects were never on the same subject. As is described by the
general theory of physical environments in Chapter 12, both temporal and spatial
factors are deeply concerned with both acoustic design and architectural design.

As an initial design sketch of the Kirishima International Concert Hall, 
Kagoshima, Japan, a plan, shaped like a leaf (Figure 10.2) was presented at the 
first meeting for further discussion with the architect, Fumihiko Maki, with the 
explanation of temporal and spatial factors of the sound field (Ando, 1985). After 
some weeks, Maki presented a revised scheme of the concert hall as shown in 
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(a) 

Figure 10.2. A leaf shape of the plan proposed for the Kirishima International Concert Hall,
Kagoshima, Japan. (a) Original leaf shape (asymmetry); and (b) proposed asymmetrical
shape for the plan. The sound field for seats in the circle is well designed for reflections
from the walls on the stage and side walls.


Figure 10.3 (Maki, 1994a, b; Maki, 1997; Ikeda, 1997). Without any change of
plan and cross sections, the calculated results indicated excellent sound fields, as
shown in Figures 10.4 and 10.5 (Nakajima and Ando, 1997).

The final architectural schemes, together with the special listening room for
testing individual preference of the sound field and selecting the appropriate seats
for maximizing individual preference of the sound field, are shown in Figure 10.6.
In these figures, the concert courtyard, the small concert hall, several rehearsal
rooms, and dressing rooms are also indicated. The Kirishima International Concert
Hall under construction is shown in Photo 10.1, in which the leaf shape may be
seen.



Figure 10.4. Calculated acoustic factors at each seat. (a) Listening level; (b) Δt_1, initial time-delay gap
between the direct sound and the first reflection; (c) A-value, the total amplitude of reflections; and (d) IACC. The reverberation time desia
band. 6 






(6) Details of Acoustic Design 

(a) For Listeners on the Main Floor 

In order to obtain a small value of the IACC for most listeners, the ceilings consisted
of a number of triangular plates with adjusted angles (Section 8.2), and the side
walls were given a 10% tilt with respect to the main-audience floor (Section 8.1),
as is shown in Figure 10.7 as well as in Figure 10.3. In addition, diffusing elements
are designed on the side walls to avoid the image shift of sound sources on the
stage caused by strong reflections in the high-frequency range above 2 kHz.
These diffusers on the side walls are designed as a deformation of Schroeder’s
diffuser described in Section 8.3, with the wells taken away, as shown in the detail of
Figure 10.8(a).

(b) For Music Performers on the Stage 

In order to provide reflections from places near the median plane of each of the 
performers on the stage, the back wall on the stage is carefully designed as shown 
at the lower left in Figure 10.7. The tilted back wall consists of six subwalls with 
angles adjusted to provide the appropriate reflections within the median planes of 
the performers. It is worth noticing that the tilted side walls on the stage provide 
good reflections to the audience sitting close to the stage, resulting in a decrease of 
the IACC. Also, the side wall may provide the effective reflection (arriving from 
the back) for a piano soloist. 

(c) Stage Floor Structure 

For the purpose of suppressing the normal-mode vibration (Morse, 1948) of the
stage floor and anomalous sound radiation from the stage floor during performances,
the joists form triangles without any neighboring parallel structure, as
shown in Figure 10.8(b). The floor is designed to be thin, 27 mm,
in order to radiate sound effectively through the vibration from instruments like the
cello and contrabass. During rehearsal, music performers may control radiation
power somewhat by adjusting their positions and/or by the use of a rubber pad
between the floor and the instrument. There are, however, many problems to be
solved for a more adequate design of the floor vibrations and the related sound
radiation from the floor in connection with the direct sound radiation from the
musical instrument itself. This hall was opened in July 1994 (Photo 10.2).


10.2. Multiple-Purpose Auditoria 

As discussed in Chapter 9, all the listeners tested preferred a small value of the
IACC as a spatial factor of the sound field; therefore, the IACC at each seat must be
controlled by the spatial characteristics of the reflectors in the room. The main purpose
of the auditorium to be designed is accommodated by the association of likely
source signals with the values of the effective duration of the ACF. For example,
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Figure 10.5c, d. Calculated total scale values at each seat, (c) Performing position: stage 
front with Music Motif A. (d) Performing position: stage rear with Music Motif B. The 
performing position of the stage rear part near to the wall is recommended. 
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Figure 10.6c. The final scheme of the Kirishima International Concert Hall, Kagoshima, 
Japan designed by the architect Maki (1994a, b). (c) Cross-section. 



Photo 10.1. The Kirishima International Concert Hall, Kagoshima, Japan, under construc- 
tion. The leaf shape may be seen in the center part (photo by Yamamoto). 
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(a) 


Figure 10.8. Detail of the diffusing side walls effective for the higher-frequency range
above 1.5 kHz, avoiding the image shift of the sound source well on the stage. The surface
is deformed from Schroeder’s diffuser by removal of the partitions (a; scale bar 0 to 50 cm).
Detail of the triangular joist arrangement for the stage floor, avoiding anomalous radiation
due to the normal modes of vibration by some musical instruments touching the floor
(b; scale bar 0 to 5 m).
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Photo 10.2. The Kirishima International Concert Hall, Kagoshima, Japan, with tilt side 
walls and triangular ceilings (photo by Sakimoto). 


when the hall is designed for speech with τ_e = 22 ms, then the reverberation time
may be determined by maximizing the subjective preference, centered on about
0.5 s (≈ 23 × 22 ms, occupied). In the use of the auditorium for multiple purposes,
however, the subsequent reverberation time must be controlled, conforming
to the value of τ_e of the program source.

Acousticians have attempted to control the IACC and the subsequent reverberation
time by changing either the absorption coefficient of the boundary, the total
cubic volume of the room, or both. In addition, some acoustical designers may modify
these factors by means of an electroacoustic system, enhancing the sound field in the
auditorium.
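
The dependence of the reverberation time on absorption and volume can be illustrated with Sabine's classical formula, T ≈ 0.161 V / A, where V is the room volume in m³ and A is the total absorption in m² (surface area times mean absorption coefficient). The formula is brought in here only as a familiar approximation, not as the design method of this chapter; the room dimensions below are hypothetical and the function name is illustrative.

```python
def sabine_rt(volume_m3, surface_m2, mean_absorption):
    """Sabine estimate of the reverberation time in seconds."""
    if volume_m3 <= 0.0 or surface_m2 <= 0.0:
        raise ValueError("volume and surface area must be positive")
    if not 0.0 < mean_absorption <= 1.0:
        raise ValueError("mean absorption coefficient must be in (0, 1]")
    return 0.161 * volume_m3 / (surface_m2 * mean_absorption)


# Hypothetical room: 6000 m^3 volume, 2400 m^2 boundary surface.
print(sabine_rt(6000.0, 2400.0, 0.20))  # reflective boundary
print(sabine_rt(6000.0, 2400.0, 0.35))  # more absorption shortens the decay
print(sabine_rt(8000.0, 2400.0, 0.20))  # more volume (same absorption) lengthens it
```

Holding the absorbing surface fixed while increasing the volume is a simplification made only to isolate the two effects named in the text.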


10.2.1. A Round-Shaped Hall 

Since the time of ancient Greece and Rome, architects attempting to design 
circular-shaped auditoria have met with some acoustic difficulty. Nevertheless, for 
the purpose of some events and fashion shows, the round-shaped hall is designed 
like an “unidentified flying object (UFO).” 

In order to control the acoustic factors according to the subjective preference theory
described in Chapter 4, a number of acoustic elements are considered, as shown in
Figure 10.9(a, b) (Takatsu, Mori, and Ando, 1997; Takatsu, Hase, Sakurai, Sato,
and Ando, 1998).
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Figure 10.9. Detail of ceilings (a), and cross-section (b) of the round-shaped event hall 
for the Fashion Plaza, Kobe. A number of elements are designed for both of the left- 
hemispheric-temporal factors and the right-hemispheric-spatial factors. 


(1) Reflectors above the stage are designed with adjustable tilt angles. The effects
of the tilt angle on the values of the IACC are shown in Figure 10.10. The left-hand
part of the figure indicates the contour lines of equal IACC values calculated
with the horizontal reflectors, and the right-hand part indicates those with the
appropriate tilt angle of the reflectors. Obviously, the tilted reflectors are effective
in decreasing the IACC throughout the audience floor of 400 seats. Figure 10.11
shows the interaural cross-correlation function calculated for the 500 Hz octave
band range. These results ensure τ_IACC = 0, so that no image shift of the sound
source occurs.
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Room for 

parents with small children (b) 


Figure 10.10. Effects of reflectors above the stage on the IACC calculated at each seat, 
(a) Horizontal reflectors above the stage; and (b) reflectors tilted above the stage as shown 
in Figure 10.9(b). 


(2) For the purpose of decreasing the IACC over a wide range of frequency, and
based on the most suitable direction of reflections for each frequency band as shown
in Figure 6.3, a number of convex reflectors are distributed asymmetrically, close
to the ceiling. These are used as lighting covers as well. The convex reflectors of
smaller diameter are placed in the center of the ceiling to obtain reflections in
the higher-frequency range (Photo 10.3), because the appropriate direction of reflection
to listeners must be kept at a small angle measured from the median plane
(above 2 kHz). The reflectors of large diameter are effective for the lower-frequency range,
providing large angles of reflection from the median plane (Section 8.3).

(3) The convex center canopy acts as a diffuser for avoiding strong reflection in 
the median plane, but providing enough energy to all the seats. 

(4) The spaces both above and below the ears play equally important roles in 
the binaural and spatial factor (IACC). When the floor is acoustically transparent, 
then the space under the floor is used to avoid the strong low-frequency attenuation 
(Section 8.5), and at the same time is used to control the IACC with diffusers placed 
under the floor. 

(5) The reversible reflectors, with and without absorbing material, on the side
walls, on and near the stage, control the IACC as well as the T_sub.

(6) The room for parents with babies located at the side opposite the stage 
attenuates the long-path echoes along the circular wall in the hall. 

(7) The small reverberant spaces with different reverberation times, together 
with the digital reverberation machines, form a hybrid reverberator. This system, 
with the loudspeakers distributed near the ceiling and under the floor, is designed 
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(a) 


(b) 


Figure 10.11. Examples of the interaural cross-correlation function calculated at seats 2
and 33 shown in Figure 10.10(a), with the data of Music Motif B.


to add reverberation time to the room, which was designed with a reverberation time of 0.5 s,
according to the type of program sources.

In a manner similar to the method of designing the concert hall with the music
sources as well as the speech located on the stage, an electroacoustic system
with multiple loudspeaker channels may also be designed. Taking the directivity
of the loudspeakers, the properties of the delay machines and the reverberators,
and the reflection properties of the walls into consideration, we can calculate the impulse
response at each seat (Takeuchi, Mori, and Ando, 1997). Hence the global
subjective preference, according to the four acoustic factors, may be obtained. In particular,
for public address systems, the range of the effective duration of the ACF (τ_e)
of speech signals may be 20 ms to 30 ms. For music reproduction, the range of
τ_e should be decided depending on the purpose of the acoustic space. Usually, the
range of τ_e for music is much greater than that for speech. Thus, two sound systems
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Photo 1 0.3. Computer graphic for the round-shaped event hall for the Fashion Plaza, Kobe, 
Japan. 


at least are recommended, in order to maximize subjective preference.
If the subjective preference of speech is maximized, speech clarity may also be
increased by both the temporal factors and the spatial factors, minimizing the IACC
(Section 6.4).

10.2.2. A Hall with Movable Stage Towers 

This example has been developed by ARTEC, New York, and illustrates how a
large stage accommodating theatrical scenery may be transformed into a successful
hall for musical performance.

(1) A number of movable stage towers, weighing about 1 ton each, may be arranged
to set the shape and size of the stage enclosure according to the music program
source. The area of the stage floor may also be changed by two elevators, which
are the floors of the orchestra pit, as shown in Figure 10.12.

(2) The acoustical canopy above the stage is changed in height to provide back
reflections for the performers (Figures 7.3 to 7.5) as well as the listeners
(Figure 6.3).

(3) The tilted ceilings above the audience ensure small values of the IACC for seats
far from the stage.
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Figure 10.12. Cross-section of the El Pomar Great Hall, Pike’s Peak Center, Colorado 
Springs, USA (after Architectural Record, August 1984). The acoustic canopy above the 
stage and the movable stage towers are arranged according to the music program. 


10.2.3. A Hall with Variable Coupled Cubage 

As shown in Figure 10.13, Johnson, Kahle, and Essert (1997) proposed a method
to control the decay characteristics of the reverberation time in concert halls,
incorporating partially coupled spaces and keeping a certain degree of clarity. This
method includes both the variable couplings and the variability in design features,
i.e., both the width and height of the hall and the strength of the early reflections.
Thus, we can optimize the sound field to suit individual musical compositions.
Figure 10.13(a) provides a narrow room with a low ceiling, and Figure 10.13(b)
provides a room with a high ceiling and a width of 35 m. Examples of such
reverberance chambers may be seen at the Festival Hall in Tampa, Florida, the
Meyerson-McDermott Concert Hall in Dallas, the Crouse Hinds Concert Theatre
in Syracuse, New York, and others.

These controls may be performed by a “sound coordinator,” a specialist similar
to a curator in a museum. The sound coordinator must understand the musical
arts as well as acoustical science.
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Figure 10.13. Control of the reverberation time and early reflections by change to the 
volume of the room (Johnson, Kahle, and Essert, 1997). 



11 

Acoustical Measurements of the 
Sound Fields in Rooms 


Following construction, acoustic measurements are made for the purpose of testing
the acoustic factors which were calculated at the design stage of the halls.
The accumulation and understanding of such data improves the calculation of the
acoustic factors performed at the design stage.


11.1. Binaural Impulse Response 

A diagnostic system for measuring the impulse responses at the two ear entrances,
determining the acoustic factors, and further evaluating subjective attributes of
the sound field at each seat in a hall, is shown in Figure 11.1. A pseudo-random-binary
signal is radiated from the loudspeaker to measure the impulse responses
by two tiny microphones placed at the two ear entrances of a real head (1.1 m
above the floor). Then, the spatial factors of the right hemisphere specialization
(LL and IACC) and the temporal factors of the left hemisphere specialization (Δt_1
and T_sub) are analyzed. When the effective duration of the ACF of the source signal
(τ_e) is analyzed, then the total scale value, adding the scale values of the spatial
factors g_r(x) and of the temporal factors g_l(x), referring to the most-preferred
conditions, may be obtained. The value of (τ_e)_min is used to determine the most-preferred
temporal values for [Δt_1]_p and [T_sub]_p (Sections 3.1 and 4.3). The scale
value of the subjective preference of the sound field is obtained after obtaining
the measured physical factors. If the source signal is fed into an ACF-processor,
then outputs numbered 1 through 4 may be used to control the sound field with an
electroacoustic system simultaneously without any manual adjustment, preserving
the preferred conditions of the four factors.

The pseudo-random-binary signal $m_j$ is generated by a system with shift 
registers, as shown in Figure 11.2 (Davies, 1966). For example, if $n = 4$, $k = 3$, 
and the initial shift-register values are all $-1$, then the output sequence becomes 

$$m_j = -1, -1, -1, -1, +1, +1, +1, -1, +1, +1, -1, -1, +1, -1, +1 \quad (j = 1, \ldots, 15).$$


Y. Ando, Architectural Acoustics 
© Springer-Verlag New York, Inc. 1998 

Figure 11.1. A system for measuring the four orthogonal factors and evaluating subjective 
qualities at each seat in a room. TS: test signal (maximum-length-sequence signal); IPR: 
impulse response analyzer; RH: right hemispheric factors (listening level and IACC); LH: 
left hemispheric factors ($\Delta t_1$, $T_{sub}$, and the $A$-value in addition); CP: comparators 
with the most-preferred conditions based on the effective duration of ACF, $\tau_e$; ACF: 
autocorrelation-function processor, $\tau_e$; SIG: source signals; $g_r(x)$: scale values from 
the right hemispheric factors; $g_l(x)$: scale values from the left hemispheric factors; and 
$S$: total scale value of a certain subjective attribute. 



Figure 11.2. Generation of a maximum-length sequence (Davies, 1966). 

Figure 11.3. A fast method of measuring the impulse response (Alrutz, 1981). 


This is repeated with a period of 15 binary digits. The largest possible period 
for the system is given by $L = 2^n - 1$ (in this case, $n = 4$), and a sequence with 
this period is called a maximum-length sequence. 
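
The generation rule can be sketched in a few lines of Python. This is a minimal illustration, not the circuit of Figure 11.2: it assumes that the feedback combines the outputs of stages $k$ and $n$ by modulo-2 addition, which for values of ±1 becomes a multiplication. That assumption reproduces the 15-digit sequence listed above for $n = 4$, $k = 3$. The function names are illustrative, and not every pair ($n$, $k$) yields a maximum-length sequence.

```python
def mls(n=4, k=3, init=-1):
    """Return one period (2**n - 1 values) of a +/-1 shift-register sequence."""
    length = 2 ** n - 1
    seq = [init] * n                       # initial contents of the n registers
    while len(seq) < length:
        # Feedback from stages k and n; multiplying +/-1 values plays the
        # role of modulo-2 addition on the equivalent 0/1 bits (assumption).
        seq.append(seq[-k] * seq[-n])
    return seq


def circular_autocorrelation(seq, lag):
    """Periodic (circular) autocorrelation of seq at an integer lag."""
    L = len(seq)
    return sum(seq[i] * seq[(i + lag) % L] for i in range(L))


m = mls()
print(m)
print([circular_autocorrelation(m, lag) for lag in range(len(m))])
```

The second printed list shows the property that makes the sequence useful as a test signal: its periodic autocorrelation equals $L$ at zero lag and $-1$ at every other lag, which approximates an impulse.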

The input signal $x(t)$ to the linear system $h(t)$ under test is obtained from the 
sequence $m_j$. As shown in Figure 11.3, the algorithm P T HP enables us to compute 
the impulse response with only a summation of the output data $y(t)$ from the system 
(Alrutz, 1981). Since the Sylvester-type Hadamard matrix $H$ contains only the 
entries +1 and -1, the computation is performed by addition and subtraction alone, 
without any multiplication. 
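
The principle behind the measurement can be illustrated without the fast algorithm. The sketch below is not the Hadamard-transform method of Alrutz; it assumes a plain circular cross-correlation between the sequence and one period of the steady-state output, which gives the same result more slowly. The impulse response `h_true` and all function names are hypothetical.

```python
def mls(n=4, k=3, init=-1):
    """One period of the +/-1 sequence (feedback rule is an assumption)."""
    seq = [init] * n
    while len(seq) < 2 ** n - 1:
        seq.append(seq[-k] * seq[-n])
    return seq


def circular_convolve(x, h):
    """Steady-state response of a linear system h to the periodic input x."""
    L = len(x)
    return [sum(h[t] * x[(i - t) % L] for t in range(len(h))) for i in range(L)]


def recover_impulse_response(m, y):
    """Estimate h from one period of the output y by circular cross-correlation."""
    L = len(m)
    r = [sum(m[j] * y[(j + k) % L] for j in range(L)) for k in range(L)]
    # The periodic autocorrelation of m is L at zero lag and -1 elsewhere, so
    # r[k] = (L + 1) * h[k] - sum(h), and sum(r) = sum(h).
    offset = sum(r)
    return [(rk + offset) / (L + 1) for rk in r]


m = mls()
h_true = [1.0, 0.5, 0.0, -0.25]            # hypothetical short impulse response
y = circular_convolve(m, h_true)
h_est = recover_impulse_response(m, y)
print(h_est[:6])
```

The recovery is exact only when the impulse response is no longer than one period of the sequence, which is why a long sequence is needed in a reverberant room.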

Examples of binaural impulse responses measured at a seat close to the stage 
(seat a, left ear) and at a rear seat (seat b, right ear) in the Kirishima 
International Concert Hall are shown in Figure 11.4. In this measurement, an 
omnidirectional-dodecahedron loudspeaker with 12 full-range drivers was placed on 
the stage, 1.5 m above the floor, as the sound source. It can be observed that the 
total amplitude of reflections, $A$, at seat a (close to the stage) is smaller than 
that at seat b (far from the stage). 


11.2. Reverberation Time 

After the impulse response is obtained, the reverberation time is measured by the 
Schroeder method (1965a, b). The integrated decay curve as a function of time 
may be obtained by squaring and integrating the impulse response of the sound 
field in a room, such that 

$$\langle s^2(t) \rangle = K \int_{t}^{t+T} h^2(x)\,dx, \tag{11.1}$$

where the time $T$ should be chosen sufficiently longer than the reverberation time. 
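
A minimal numerical sketch of the integration is given below. It makes several assumptions that are not in the text: the impulse response is sampled, the upper limit of the integral is fixed at the end of the record, the test signal is a synthetic exponential decay, and the reverberation time is read from a least-squares line fitted between -5 dB and -35 dB and extrapolated to -60 dB. The function names and the sampling rate are hypothetical.

```python
import math


def integrated_decay_db(h):
    """Backward-integrated squared impulse response in dB re its initial value."""
    energy, curve = 0.0, []
    for sample in reversed(h):             # running sum of h^2 from the tail
        energy += sample * sample
        curve.append(energy)
    curve.reverse()
    return [10.0 * math.log10(e / curve[0]) for e in curve if e > 0.0]


def reverberation_time(decay_db, fs, upper=-5.0, lower=-35.0):
    """Least-squares slope of the decay between two levels, extrapolated to -60 dB."""
    pts = [(i / fs, d) for i, d in enumerate(decay_db) if lower <= d <= upper]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    slope = (sum((t - mean_t) * (d - mean_d) for t, d in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    return -60.0 / slope


fs = 1000                                   # hypothetical sampling rate [Hz]
rt_true = 2.07                              # decay time of the synthetic response [s]
h = [10.0 ** (-3.0 * i / (fs * rt_true)) for i in range(5 * fs)]
print(round(reverberation_time(integrated_decay_db(h), fs), 2))
```

With a measured response, background noise at the end of the record bends the curve upward, which is one reason for keeping the fitted range well above the noise floor.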

Examples of the decay-curve measurement and the decay rate at both the left 
and right ears at seat a for the 500 Hz octave band are shown in Figure 11.5. The 
reverberation times measured are both 2.07 s. The reverberation times measured 
with octave-band filters in the Kirishima International Concert Hall (without an 
audience) are plotted as full circles in Figure 11.6. The empty circles are 
estimated values of the reverberation time for a full audience. 

Figure 11.4. Impulse responses measured at seats a (near to the stage) and b (far from the 
stage) in the Kirishima International Concert Hall. The amplitude of the impulse response 
measured at seat a is attenuated. 


It is worth noting that Jordan (1969) showed that the values of the early decay 
time (EDT) measured over the first 10 dB of decay are close to the values of the 
reverberation time evaluated over the interval from -5 dB to -35 dB. 

The total amplitude of reflection $A$, defined by Equation (4.6), is obtained as its 
square 

$$A^2 = \frac{\int_{\varepsilon}^{\infty} h^2(x)\,dx}{\int_{0}^{\varepsilon} h^2(x)\,dx}, \tag{11.2}$$

where $\varepsilon$ signifies a small delay time just large enough to cover the duration of the 
direct sound. 
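
For a sampled impulse response, Equation (11.2) reduces to a ratio of two sums. The sketch below assumes that the split between direct sound and reflections falls at the sample nearest to the chosen delay; the toy response, the sampling rate, and the function name are hypothetical.

```python
import math


def total_reflection_amplitude(h, fs, epsilon):
    """A = sqrt(energy after epsilon / energy up to epsilon), cf. Equation (11.2)."""
    split = int(round(epsilon * fs))       # first sample counted as a reflection
    direct = sum(x * x for x in h[:split])
    reflected = sum(x * x for x in h[split:])
    return math.sqrt(reflected / direct)


fs = 1000                                   # hypothetical sampling rate [Hz]
h = [1.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.5]   # direct sound + two reflections
print(total_reflection_amplitude(h, fs, epsilon=0.002))
```

In this toy case the direct sound has unit energy and the two reflections carry 0.25 each, so $A^2 = 0.5$.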
Figure 11.5. Integrated decay curves obtained from the impulse responses at the two ear 
entrances, at seat a in the Kirishima International Concert Hall. 

Figure 11.6. Reverberation time measured in the Kirishima International Concert Hall 
as a function of the octave-band center frequency [kHz]. 
(•): measured values without audience; and (o): estimated with full audience. 

11.3. Measurement of Acoustic Factors at Each Seat in a 
Concert Hall 

We discussed in Chapter 9 the seat selection system designed for the purpose 
of enhancing individual satisfaction. To begin with, the four orthogonal factors 
are measured at each seat in a concert hall (Ando, Sato, Nakajima, and Sakurai, 
1997; Nakajima and Ando, 1997; see also Sakurai, Aizawa, and Ando, 1998). 

Measured values of the listening level (LL), the total amplitude of reflection ($A$), 
the initial time delay gap ($\Delta t_1$) between the direct sound and the first reflection, 
excluding the reflection from the floor, and the IACC at each seat are shown in 
Figure 11.7. The reverberation times at all the seats had almost the same value, 
about 2.05 s for the 500 Hz band. 

Even though the width of the hall in the final scheme was 1 m larger than in the 
scheme at the design stage, the measured values of each physical factor, as shown 
in Figure 11.7, are not much different from the values calculated in Figure 10.4. 

11.4. Recommended Method for the IACC Measurement 

The following two methods are recommended for measuring the IACC, as needed 
for subjective evaluations or for the sound field tests after construction of a 
room: 

(1) In order to evaluate the subjective responses of the sound field, measurement 
of the interaural crosscorrelation function (with all values of the IACC, $\tau_{IACC}$, 
and $W_{IACC}$ defined in Figure 3.7) is recommended. These measurements 
are performed after passage through the A-weighting network with the music or 
speech signal, under conditions identical to those of the subjective judgments, at 
each seat position in an existing hall or at the listening seat of a simulated sound 
field in the laboratory. If a loudspeaker is used as the sound source in the room 
under test, then the same characteristic source signal must be used both in the 
measurement of the physical acoustic factors and in the subjective tests. Since the 
interaural crosscorrelation function is a spatial factor, it is recommended that 
measurements be made for all of the direct sound and reflections without any 
temporal subdivisions. It is worth emphasizing that the preferred condition is 
$\tau_{IACC} = 0$, as described in Section 4.5. 

(2) In order to compare the sound field in the existing hall after construction 
with that calculated at the design stage, measurements in each octave band are 
performed, together with the other factors, $\Delta t_1$, $T_{sub}$, $A$, and LL. This may include 
measurement of the $IACC_E$ (Hidaka, Beranek, and Okano, 1995), which is 
defined by an integration interval of 0 ms to 80 ms for the direct sound and early 
reflections, as well as the IACC, which is defined by an interval of 0 ms to $\infty$ for 
the direct sound and all reflections and reverberation. A typical example of the 
IACC measured as a function of the integration interval, obtained in Symphony 
Hall, Boston, is shown in Figure 11.8. It is remarkable that a close relationship is 
found between the values of the $IACC_E$ at 80 ms and the IACC (at 3000 ms), as 
shown in Figure 11.9. 
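
The dependence on the integration interval can be sketched as follows. The code assumes the usual definition of the IACC as the maximum magnitude of the normalized interaural crosscorrelation function within a lag range of ±1 ms; the sign convention of the lag, the test responses, and the function name are assumptions for illustration only. Passing `t_end=0.080` gives an early-interval value in the sense of the $IACC_E$, and omitting it integrates over the whole record.

```python
import math


def iacc(h_left, h_right, fs, t_end=None, max_lag_ms=1.0):
    """Return (IACC, tau_IACC in seconds) over the integration interval 0..t_end."""
    n = min(len(h_left), len(h_right))
    if t_end is not None:
        n = min(n, int(round(t_end * fs)))
    left, right = h_left[:n], h_right[:n]
    norm = math.sqrt(sum(x * x for x in left) * sum(x * x for x in right))
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    best, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Sum only over samples where both indices stay inside the interval.
        acc = sum(left[i] * right[i + lag]
                  for i in range(max(0, -lag), min(n, n - lag)))
        if abs(acc) / norm > best:
            best, best_lag = abs(acc) / norm, lag
    return best, best_lag / fs


fs = 48000                                  # hypothetical sampling rate [Hz]
left = [0.0] * 200
left[10], left[20] = 1.0, 0.5               # toy response: direct sound + reflection
right = [0.0] * 5 + left[:-5]               # same response arriving 5 samples later
print(iacc(left, right, fs, t_end=0.080))
```

Because the right-ear response here is a delayed copy of the left-ear response, the sketch returns an IACC of unity at a lag of five samples, the case of a single lateral source with no decorrelating reflections.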

When the space is used for dancing, ice skating, or a party, the listeners face 
in various directions. The values of the IACC and $\tau_{IACC}$ are then measured as a 
function of the direction of the head. The results measured with 500 Hz 
octave-band noise in an oblong atrium of a hotel, at a distance of 10 m from the 
source position, are shown in Figure 11.10. When the listener is facing the sound 
source, IACC = 0.41 and $\tau_{IACC} = 0$, and thus no image shift occurs. These values 
are nearly unchanged for head directional angles less than 30°. When the listener 
is facing the lateral side at 90°, the IACC becomes greater than 0.50 and $\tau_{IACC}$ 
is about 600 μs, due to the “lateral sound source.” 

If we are interested in the “apparent source width (ASW),” then the calculation 
and measurement of $W_{IACC}$ of the sound field are recommended. (The calculation 
may be carried out by Equation (3.25), with the value of $\delta$ defined as shown in 
Figure 3.7.) 



Figure 11.8. Measured IACC as a function of the integration interval, $2T$, of the impulse 
responses, for each octave-band range (Hidaka, Beranek, and Okano, 1995). 

Figure 11.9. The relationship between the values of the $IACC_E$ and IACC (Hidaka, personal 
communication, 1996). 

Figure 11.10. Measured IACC and $\tau_{IACC}$ as a function of the head direction to the sound 
source, in a hotel atrium (Kobayashi et al., 1997). 

11.5. Physical Properties of a Forest 
as an Acoustic Space 

It is believed that birdsong in the forest was the origin of music, and that the 
forest was probably the first performance space for vocal music, sung during a walk. 
The sound field in the forest consists of multiple scattering phenomena, and it is 
complicated to calculate the impulse response even with a computer. Thus, the 
impulse response was measured by reproducing a pseudo-random-binary signal 
of 2.7 s with a clock frequency of 48 kHz. The knowledge obtained here may be 
used to understand music performance in the forest, and further to design the 
sound field in rooms with a large number of columns. 
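
The duration and clock frequency quoted above are consistent with a maximum-length sequence of period $2^n - 1$. The short check below finds the smallest register length whose period reaches 2.7 s at 48 kHz; the register length is not stated in the text, so the value found is an inference rather than a reported parameter.

```python
fc = 48_000          # clock frequency [Hz], from the text
target = 2.7         # signal duration [s], from the text

# Smallest register length n whose period (2**n - 1) / fc reaches the target.
n = next(n for n in range(2, 32) if (2 ** n - 1) / fc >= target)
period = (2 ** n - 1) / fc
print(n, 2 ** n - 1, round(period, 3))
```

A register length of 17 gives 131 071 digits, or about 2.73 s, which rounds to the quoted 2.7 s.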

In order to investigate the effects of multiple scattering by trees, acoustic factors 
were measured on a 5 m wide asphalt path in a forest, as shown in Photo 
11.1 and Figure 11.11 (Sakai, Sato, and Ando, 1996; 1998). The sketch shows 
that, within a part of the forest with an area of 60 m × 45 m, there were trees 
of various diameters between about 0.3 m and 1.0 m; the average height of the 
trees was about 18.5 m. A loudspeaker similar to that used in the hall measurements, 
an omnidirectional dodecahedron with 12 full-range drivers, was placed 1.5 m above 
the path to radiate the pseudo-random-binary signal. The impulse responses obtained 
at the two ear entrances of a real head (20 m from the source position, 1.6 m in 
height) are shown in Figure 11.12. A strong reflection from the asphalt surface at 
about 1 ms is observed, but no clear initial time-delay gap between the direct sound 
and the first reflection from the trees is identified. Thus, three of the four factors 
were measured here. The integrated decay curve obtained from the impulse response 
at the left ear entrance is shown in Figure 11.12(c). A logarithmic decay may be 
found for the initial decay range of about 15 dB, just after the strong direct sound 
and the asphalt reflection. Typical multiple-scattering effects may be observed in 
the curve after 0.4 s, where it departs from the logarithmic decay. This decay curve 
differs greatly from the curves shown in Figure 11.5, which were measured in a room. 

The sound-pressure levels for each octave-band signal as a function of distance, 
relative to the pressure level at 5 m, are shown in Figure 11.13. It is of interest 
to note that the level of the 125 Hz band signal was higher than predicted by the 
inverse-square law at 10 m to 20 m, due to the reflection from the asphalt surface, 
while the sound-pressure level for the other frequency bands decreased more than 
the law predicts, due to the long paths of multiply scattered reflections and 
absorption by the trees. 
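
For reference, the inverse-square law itself is easy to tabulate. The sketch below assumes a point source in a free field and expresses the level relative to the 5 m reference used in the measurement; the function name is illustrative.

```python
import math


def inverse_square_level(distance_m, reference_m=5.0):
    """Level in dB relative to the reference distance, point source in a free field."""
    return -20.0 * math.log10(distance_m / reference_m)


for d in (5, 10, 20, 40):
    print(d, round(inverse_square_level(d), 1))
```

Each doubling of distance lowers the level by about 6 dB, so a measured level above this line indicates added energy (here, the asphalt reflection) and a level below it indicates excess attenuation.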

The reverberation times, with distance as a parameter, are shown in Figure 11.14. 
The remarkable findings here are that: 

(1) Up to 5 m from the source, the reverberation time is shorter than 0.7 s. The 
longest reverberation time is observed at 500 Hz. 

(2) The reverberation time increased with increasing distance from the source, 
particularly in the middle-frequency range around 500 Hz, where it rose 
to about 1.8 s at a distance of 40 m. 

(3) The reverberation time in the lower-frequency range (below 250 Hz) is shorter 
than 1.3 s. However, the listening level there is relatively higher than in the 
higher-frequency range, creating a loudness balance. 

Photo 11.1. A forest investigated for its acoustic factors. 

Figure 11.11. A forest where the orthogonal acoustic factors were measured. The tree 
diameters are roughly classified as 0.3 m, 0.6 m, and 1.0 m. 

The measured IACC values are shown in Figure 11.15. For the frequency range above 
1 kHz, the value of the IACC rapidly decreased with distance. The IACC measured 
for this frequency range at a distance of 40 m was less than 0.5. 

Referring to these measured factors (except for the initial time-delay gap), it is 
concluded that the sound field in the forest provides excellent acoustic conditions 
for sources such as flutes, string music, and vocal music, as well as for birdsong, 
which contain middle-frequency components between 500 Hz and 2 kHz. This 
is particularly true at a distance of about 40 m from the source location. 

Figure 11.12. Binaural impulse responses measured at 20 m from the sound source. (a) Left 
ear entrance; (b) right ear entrance; and (c) integrated decay curve at the left ear entrance. 

Figure 11.13. Relative LL measured as a function of the distance. (o): 125 Hz; (△): 250 
Hz; (•): 500 Hz; (□): 1 kHz; (■): 2 kHz; and (O): 4 kHz. 

Figure 11.14. Measured reverberation time as a function of the octave-band center 
frequency [kHz], with the distance from the sound source as a parameter. (O): 5 m; 
(•): 10 m; (△): 20 m; and (□): 40 m. 

Figure 11.15. Measured IACC with the distance from the sound source as a parameter 
($\tau_{IACC} = 0$). (O): 5 m; (•): 10 m; (△): 20 m; and (□): 40 m. 


12 

Generalization to Physical 
Environmental Planning Theory 


We have evolved in a universe which can be measured by using the dimensions of 
time and space. Consequently, the dimensions of the “inner universe” (an 
individual’s subjective model of the universe) must correspond in some way to those 
of the physical universe. The brain receives environmental stimuli not only from the 
hearing system, but also from the visual, thermal, and other senses, to one extent or 
another. These are associated with an individual’s total subjective and physiological 
evaluation of any physical space and, more particularly, of a concert hall. 

This chapter suggests a foundation for the theory of planning physical environ- 
ments, which takes account of time and space as they are specialized in human 
cerebral hemispheres. A representative example of the application of this con- 
cept of environmental planning has been discussed in the chapters on concert hall 
acoustics: the sound field in a hall can be altered with careful manipulation of 
four orthogonal factors. These variables comprise two temporal factors of the left- 
hemisphere dominance, and two factors of the right hemisphere dominance which 
are related to spatial attributes. The design of a specific concert hall can be altered to 
suit specific types of music, such as chamber music or choral works. This concept 
can be partially found in the architecture of Tadao Ando (Frampton, 1985; 1987), 
the traditional tea houses in Japan, and the urban design theory of Lynch (1972). 

The physical variables are specific, measurable factors, such as the temperature 
and lighting level, that influence a person’s perception of the environment. If 
these variables are identified, and their interrelationships and influences 
on human perception explored, a method can be developed that has application in the 
field of physical environmental design, including architecture and urban design. 


12.1. A Generalized Theory of Designing Physical 
Environments 

12.1.1. Hemispheric Specialization 

It has been noted that the two hemispheres of the brain have different areas of 
specialization and ability. As discussed in Chapter 5, subjective preference about 




time-factored experience for the sound field takes place in the left hemisphere, 
and the spatial-factored experience in the right hemisphere (Ando, Kang, and 
Morita, 1987; Ando, 1992; Ando and Chen 1996; Chen and Ando, 1996; Nishio 
and Ando, 1996; Chen, Ryugo, and Ando, 1997). The right hemisphere tends to 
perceive space in multiple-dimensional, nontemporal, and nonverbal terms for both 
visual and auditory environments. In most subjects, the left cerebral hemisphere 
is normally concerned with linear, sequential modes of thinking, such as speech 
and calculation (Sperry, 1974; Davis and Wada, 1974; Galin and Ellis, 1975; Levy 
and Trevarthen, 1976; Ando and Kang, 1987). 

Clearly, most experiences in daily life involve a combination of these two modes 
of thought, with the instantaneous dominance switching continually between left 
and right, and with both hemispheres often working simultaneously. The situation 
resembles that of photographs or portraits, which are two-dimensional 
representations of reality. When the time function is added by projecting a 
sequence of these images, the static image “springs to life.” 

Imagining increasing levels of complexity, we have built up a list of factors (Table 
12.1). These account for the physical orthogonal factors and influence everyday 
human experience. An ideal value for each factor, or a range of preferred values, can 


Table 12.1. Proposed physical environmental factors to be planned. 


Physical environment 

Spatial factor 

Temporal factor 

Acoustic 

(1) Listening level, LL 

(2) Initial time delay gap 
between the direct sound 
and the first reflection, $\Delta t_1$ 


(4) Interaural crosscorrelation, IACC 

(3) Subsequent reverberation 
time, $T_{sub}$ 

Visual 

(1) Lighting level 

(2) Properties of movement 
function of reflective 
surface, T 


(3) Properties of the 
reflecting surface 

(4) Spatial perception 
including distance 
factor 


Thermal 

(1) Spatial sensation 
of body 

(2) Relative humidity 

(3) Temperature 

(4) Radiant Heat 

(5) Air movement (velocity) 


Note: In addition to the above variables, the characteristics of sources, the location change 
of source, and the observer’s activity should also be taken into consideration. 
be found by systematic research utilizing the theory of subjective preference with a 
number of subjects. A matrix can thus be assembled that describes the ideal values 
and the mutual influence or independence of each. Such a planning tool could be 
utilized during the design phase of a project to predict whether humans will find 
a given physical environment pleasurable, before it is built. What such a planning 
tool would add to the practice of environmental design is a method that explicitly 
recognizes the needs of the specialization of the human brain. Specifically, a design 
method is proposed that appreciates the sensitivity of the left hemisphere to the 
dimension of time in the environment, in addition to the common consideration of 
three dimensions of space. 

There have been notable successes in the attempt to introduce consideration 
of a time element, specifically in terms of the sequential nature of movement 
through space, into the process of design. What distinguishes the present approach 
from previous ones is its attempt to describe an environment in terms that include 
the passage of time together with a spatial factor. Previous studies have focused 
primarily on an individual’s movement through space (Figure 12.1). 


12.1.2. Designing Physical Environments 

A general theory for evaluating subjective attributes is described, based on the 
theory of subjective preference for a sound field (Section 4.4). 

Let $x_i$, $i = 1, 2, \ldots, I$, be the significant physical factors of $I$ dimensions 
acting on a human; then the scale value of subjective preference, or another single 
subjective response, is given by 

$$S = g(x) = g(x_1, x_2, \ldots, x_I). \tag{12.1}$$

Figure 12.1. Sequential versus temporal experience of the environment. (a) Movement 
through space previously considered; and (b) experience of the passage of time while 
stationary. 

Next, we consider the fact that the function of the human cerebral hemispheres can 
be divided into two main categories: spatial factors which are associated with the 
right hemisphere, and temporal factors which are associated with the left. Thus, we 
can divide the physical factors defining human perception of the environment into 
two, so that Equation (12.1) may be reduced to (Ando, Johnson, and Bosworth, 
1996) 

$$S = g_l(x) + g_r(x) = g_l(x_{l1}, x_{l2}, \ldots, x_{lM}) + g_r(x_{r1}, x_{r2}, \ldots, x_{rN}), \qquad I = M + N. \tag{12.2}$$

Furthermore, considering the fact (as mentioned in Section 9.2) that the lighting 
level has little effect on the preferred initial time-delay gap between the 
direct sound and the first reflection, only minor interference is likely to be 
observed between different physical factors, such as between a visual factor and an 
auditory factor. For instance, as described in Sections 5.2 and 5.3, there is little 
interference here, since the initial time-delay gap is associated with the left 
hemisphere and the lighting level is assumed to be right-hemisphere dominant. 

It is assumed that each physical environment has an influence on the subjective 
attributes that is independent of the other physical environments, so that 

$$S = [g_l(x) + g_r(x)]_{\text{auditory}} + [g_l(x) + g_r(x)]_{\text{visual}} + [g_l(x) + g_r(x)]_{\text{thermal}} + [g_l(x) + g_r(x)]_{\text{other human physical environments}}. \tag{12.3}$$

In particular, this holds, at least in the neighborhood of the optimal conditions 
for each physical factor, avoiding an extreme physical environment, for example, 
listening to music in a temperature below 0° C. 

If physical environments other than the sound field are fixed at or near the 
optimal conditions, then 

$$S = [g_l(x) + g_r(x)]_{\text{auditory}} + c, \qquad I = 4. \tag{12.4}$$

This is the same formula as discussed in Section 4.4. Because the scale is a 
relative one, we can set the constant to $c = 0$ without loss of information. 
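
The additive structure of Equations (12.3) and (12.4) can be made concrete with a small sketch. The scale values below are hypothetical numbers chosen only for illustration, and the function name is not from the text. The point is that comparisons between two sound fields do not depend on the constant contributed by the fixed environments.

```python
def total_scale_value(contributions, constant=0.0):
    """Sum the left- and right-hemisphere scale values of each environment."""
    return constant + sum(g_l + g_r for g_l, g_r in contributions.values())


# Hypothetical scale values (g_l, g_r) for two sound fields; illustrative only.
field_a = {"auditory": (-0.2, -0.1)}
field_b = {"auditory": (-0.6, -0.3)}

for c in (0.0, 1.5):
    diff = total_scale_value(field_a, c) - total_scale_value(field_b, c)
    print(c, round(diff, 6))
```

The printed difference is the same for both values of the constant, which is why setting $c = 0$ loses nothing on a relative scale.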

Since data for calculating all of these effects in terms of the scale value are 
lacking, the last three terms in Equation (12.3) are, as yet, unavailable. However, 
we can be aware of the significant factors in designing physical environments. In 
order to blend human and physical environments, both the spatial and temporal 
factors must be taken into account, similar to the theory described in concert hall 
acoustics, blending listeners and the sound field together with the musical tempo. 

In architectural planning, we are likely to forget the temporal factors, which must 
be included in the plan and cross-sections of the building. Figure 12.2 indicates 
discrete temporal periods of the human and physical environments which have to 
be blended. Remarkably, there are certain significant periodic eigenvalues, in both 
human biological rhythm and physical environmental activities in the time domain, 
to be blended. Therefore, we do not need to consider every period in a continuum 
of infinitely many real values. 

Figure 12.2. Blending human biological rhythms and periods of natural physical 
environments. 

It is worth noting that there are many more unconscious physiological rhythms 
than subjective psychological attributes associated with the physical environmental 
activities of long periods. For example, the effects of environmental noise are 
described in terms of the development of unborn babies and the development of 
children over long-period accumulations: periods of reproduction and generation 
(Ando and Hattori, 1973; Ando, Nakane, and Egawa, 1975; Ando and Hattori, 
1977a, b; Ando, 1988). The following section discusses the significant factors 
for each physical environment for the conscious perception of the psychological 
present. 


12.1.3. Proposed Factors for Physical Environments 

In the search for a broader approach to design, the same approach used to analyze 
acoustic performance has been applied to other realms of human experience, 
namely, the visual and thermal. 

(A) Visual 

In a manner similar to acoustics, human perception of the visual environment can 
be expressed by an expression analogous to Equation (3.14), which describes a 
visual image at certain points of the left and right retinas 

$$f_{l,r}(t \mid R_0) = \{\, p_n(t \mid R_0) * A_n w_n(t, T) * h_{nl,r}(t) \,\}, \tag{12.5}$$

in which $p_n(t \mid R_0)$ signifies the characteristics of a light source located at 
$R_0 = (x_0, y_0, z_0)$; $n$ is an integer indicating each ray of light; $A_n$ is the distance 
attenuation according to the inverse law; $w_n(t)$ signifies the properties of the surface 
from which light is being reflected; $T$ is the time representing the movement of the 
surface; and $h_{nl,r}(t)$ is the physical characteristic of the “spatial perception system,” 
including the eyes and face of an observer located at $R = (x, y, z)$, { } designating 
a set. 

If we rewrite the left side of Equation (12.5) to take account of the movement 
of the observer, it becomes 

$$f_{l,r}(t; R(t) \mid R_0), \tag{12.6}$$

where $R(t)$ signifies the observer’s movement as a function of time $t$. This expression, 
therefore, has both spatial and temporal constituents. There is the spatial 
relationship of the light source and the observer, and a time dimension in the 
movement of the observer through space. Thus the visual environment can be described 
using the following factors, other than the light source characteristics and the 
movement factor $R(t)$ of the observer: 

(1) lighting level; 

(2) the time factor $T$ of a reflector; 

(3) properties of the reflecting surface, $w_n(t)$, which include color and form (edge); 
and 

(4) properties of the spatial perception, $h_{nl,r}(t)$, and the distance factor $A_n$. 

(B) Thermal 

Sensation of the thermal environment can also be described using the following 
factors: 

(1) spatial distribution sensed by the whole body; 

(2) relative humidity; 

(3) temperature; 

(4) radiant heat; and 

(5) air movement (velocity). 

Common sense indicates, however, that thermal comfort is not a static function. 
For example, after being outdoors on a cold day, a person entering a department 
store experiences a temporary sensation that the store is “too hot,” when, in fact, the 
temperature is at a normally comfortable level (Kuno, Ohno, and Nakahara, 1987). 
The thermal conditions that a person finds comfortable for riding an exercycle may 
be too chilly for sitting quietly and reading. Thermal comfort is thus conditioned by 
activity and the boundary conditions encountered. Factors for time and movement 
must be introduced. 

Considered in this way, each environment’s thermal suitability can be adjusted 
to the activities taking place therein, in a manner similar to that discussed for 
architectural acoustics. A kitchen, for example, should have a cooler thermal en- 
vironment than a library, because the kitchen has a higher level of human activity. 
Table 12.1 indicates suggested environmental variables. These factors are limited 
to those that have the possibility of being altered by designers. 


12.2. Examples of Physical Environmental Planning 

12.2.1. Discrete Periods of Environment and Human Life 

As shown in Figure 12.2, the crucial factor in the temporal dimension of the 
environment is the periodic cycle. Every aspect of the passage of time is bound up 
with periods. For example, the shortest period (about 0.5 s to 5 s), corresponding to 
the psychological present, is related to brain waves, the pulse, and breathing, which 
are associated with the perception of music, the glitter of leaves, and the ripple of 
a water surface. The rapid eye movement (REM) cycle, with a period of about 70 to 
150 minutes, related to a basic rest-activity cycle of 10 to 20 per day (Othmer, 
Hayden, and Segelbaum, 1969; Kripke, 1972), is associated with, for example, one 
session of a concert, a lecture, or work. The circadian rhythm, deeply connected 
through sunlight with the Earth’s rotation period, is associated with daily human 
activity. The week, created by social convention for work and leisure, is associated 
with, for example, the planning period of a concert or drama, or a social activity. 
The next distinguishable period is concerned with the movement of the Moon. The 
revolution of the Earth around the Sun (changing of the seasons) is associated with, 
for example, the color change of leaves and annual festivals. The black spots on the 
Sun, which appear in a cycle of about 11 years, may influence, more or less, the 
environments on Earth. Such periodic space weather, including magnetic storms, may 
affect human life (Roederer, 1995). The alternation of generations of about 
30 years and the span of life of, say, about 90 years, may be considered in the 
planning of houses, in accordance with the individual schedule of life. 

The present theory suggests that these discrete periods should be explicitly 
recognized during the design process for any human environment. The passage 
of time in the designed environment should be as consciously considered as the 
three-dimensional organization of the space itself. The following are examples of 
designing physical environments related to a concert hall. 


12.2.2. Physical Environments for a Concert Hall 

(A) Approach 

The approach passage to a concert hall is designed for both temporal and spatial 
factors, so that our senses will gradually be excited in anticipation of the music. 
In the design of the lane, for example, trees, soil, water, fire, Sun, Moon, and stars 
help to produce music in our mind as we escape from daily life to the concert. 
Direct ways of producing sound are birds singing in trees, leaves in a breeze, and a 
waterfall. Indirect ways of producing music in our mind rely on quiet elements: 
stars, Moon, Sun, and a skyline of mountains. Then, just before the concert begins, a 
well-designed sound signal should be reproduced to inform the audience that it is 
time to be seated. 

(B) Visual Environment 

In order to design a good visual environment, both for the musicians and listeners 
in a concert hall, there are several factors to be controlled, as indicated in Table 
12.1. Relating to the lighting level, Claus Ocker (1995) as a singer suggests that 
nonverbal communication between the musician and the listener in a concert hall, 
as shown in Figure 12.3, is full of emotion and tension on both sides. Communi- 
cation starts at the moment when the door to the stage is opened. The performer 
would like to see the faces of the audience and to make contact with them during 
the performance, in order to recognize the audience’s mood and to feel its intensity. 
Thus, more light in the hall than is customary is recommended for seeing the faces 
of the listeners seated in front of the musician. Ocker wonders about planned seats
behind the stage, such as a “vineyard” as in the Berliner Philharmonie, because
singers have the feeling that the sound of their voices is projected in front of them,
and not backward. An important part of artistic interpretation is that, in singing in
an expressive way, a facial expression can sometimes be seen in accordance with the
sound of the voice coming from the deepest universe of soul or heart. This is an
example of a special kind of communication between artists and audience. 

Ohgushi and Sakuma (1997) reported evidence that interaction between auditory
and visual information in percussion performance increased communication between
the performers and listeners. For the visual environment, therefore, it is
reconfirmed that a lighting level of about 35 lx is desirable, as was discussed
in Section 9.2.

(C) Thermal Environment 

The thermal environment is also important. Music performances, with low-level 
body activity and lighting on stage, need to maintain certain values of temperature 
and humidity. But many musicians are sensitive to unwanted fresh air from above
onto the stage, moving musical note pages that have been turned or are in the
process of being turned.

Figure 12.3. Nonverbal communications between music performer and listener.

When listeners are seated for a long time, their knees, legs, and feet are liable
to become numb. This is a typical example of a spatial sensation that calls for
control of the distribution of temperature and air movement.

(D) Noise 

It is well known that any possible noise must be kept to very low levels, hope- 
fully, less than NC-15 (Beranek, 1957). Even the weak sound-pressure level of the 
electronic “peep-peep” sound of wristwatches or from portable telephones may 
suddenly interrupt the full concentration of the singer (Ocker, 1995). Continu- 
ous low-level sound from ventilation systems is likely to be ignored without any 
disruption, because of the right-hemispheric dominance, but the intermittent tonal 
peep-peep sounds influence our brain more critically than continuous noise because 
of additional information interference effects with music in the left hemisphere. 


12.2.3. Play Area for Children and Residential Design

(A) Play Areas 

To take another simple example of the application of this theory, it is easy to see 
why the play-space designs of recent years hold so much more interest than the 
“playsets.” Wood structures, flexible bridges, and fanciful designs allow for much 
freer play of the imagination than the hard surfaces and steel frames of playground 
toys. It is also easy to see why natural areas such as small streams are so fascinating 
to children. The stream is an interactive “toy,” changing course and speed as the 
child performs small-scale engineering projects, blending the children and water 
in the flow of time. The “toy” itself changes, due to the child’s actions, and indeed 
continues to change even after the child stands back to admire his or her work. 
Other environmental elements include trees, soil, and flame, as well as celestial 
bodies such as the Sun, the stars, and Moon. 

(B) Residential Design 

Architectural form can be modified so that the natural periods of day and night, 
as well as clouds, sun, rain, and the seasons are embraced and made part of the 
experience of the enclosure (Ando, Johnson, and Bosworth, 1996). Architecture 
can blend and fuse natural cycles with human lives so that it becomes impossible 
to separate the two (Bosworth, 1997). Memory of a well-designed building bears 
with it, inextricably, a memory of spring sunlight as it shines across a white wall, 
or the sound of rain falling on the pavement outside. 

There are three specific architectural devices that help bring these natural cycles 
of light and dark into a building. The first, looking like a proscenium of the audito- 
rium for Nature’s performance, is the window. The most obvious window type is 
a view window, with its top above and its sill below eye level. This allows people 
inside the building to see directly the changes in Nature taking place outside. The
second type of window is a transom, located just above the human zone. This type 
brings light into the building and allows the inhabitants to see sky changes and 
other environmental phenomena, such as trees waving in the wind. The third win- 
dow type is the monitor. Monitor windows are located above the habitable zone. 
Their use is more for bringing light in than for seeing out. They allow a ceiling
to be bathed in sunlight, or a rectangle of light to move across a wall during the 
course of the day. 

The second architectural device is the courtyard (as an acoustic space, it is 
described in Section 7.3). It provides a closed view from within the interior space. 
This allows people to see natural periods of light and weather in a bounded portion 
of Nature with the architecture beyond. 

The third type of architectural device is a garden that can be viewed from inside 
a building. It allows a controlled view of a portion of Nature with natural features 
visible beyond. 

Architecture can intensify the sense of the passage of time by carefully using 
these devices. For example, light from a small window is magnified if the window
is perpendicular to an adjacent wall. As the Sun moves, a ray of light plays across 
the wall like a sundial. As clouds pass, the room goes from light to dark and back 
again. 

Interior surfaces can be modeled in the following ways: 

(1) windows can be placed next to the ceiling, wall, and floor surfaces to reflect
light; and 

(2) the surfaces adjacent to the windows, transoms, or monitors can be splayed. 

The effect is to magnify the light coming through the window. The same effect 
can be achieved with monitors next to a sloping ceiling. It is worth noting that this 
kind of system with a sloping ceiling creates a superior sound field, decreasing the 
IACC as is mentioned in Sections 8.1 and 8.2. 

Another very pleasant lighting effect with seasonal variations can be achieved 
by orienting windows toward deciduous trees. A gable window, for example, will 
allow the trees to filter and reflect the incoming light. A similar effect comes from 
sunlight playing on water, reflecting onto a ceiling from the ocean, for example.
Even while listening to music, these examples can be thought of as an invita- 
tion to Nature to enter and participate in the formation and experience of interior 
space. 

For the purposes of blending physical environments and the human brain, the 
basic design theory has been developed incorporating temporal and spatial val- 
ues. This concept was first developed scientifically for concert hall design as 
described in the previous chapters. The temporal and spatial factors strongly in- 
fluence any subjective attributes and must be considered in the design of any 
physical environment. In the temporal design of an environment, for example, tem- 
poral rhythms in physical environments have to be blended with human biological 
rhythms. 
So far, we have discussed that a meeting place of art and science may help
discover individual preference or personality as the minimum unit of society. A 
lasting peace on earth may be achieved by release of each personality given by 
Nature. 



Appendix I 

Method of Factor Analysis 


The method which is applied in the multiple-dimensional-factor analysis of Section 
4.5 is briefly described here (Hayashi, 1952; Hayashi 1954a, b). We give the 
numerical values to each subcategory of each item and synthesize the responses 
as we are concerned with behavior patterns. 

In this analysis, all items do not need to be scalable. Use the data of $n$ cases. Let
$A$ be an outside variable and define $s$ and $k$ as $s = 1, 2, \ldots, R$ ($R$ is the number
of items), and $k = 1, 2, \ldots, K_s$ ($K_s$ is the number of subcategories in the $s$th
item), respectively. Since each case checks only one subcategory in each item, the
behavior pattern of the $i$-case is to be synthesized in the form of

$$\alpha_i = \sum_{s=1}^{R} X_s(i) = \sum_{s=1}^{R} \left\{ \sum_{k=1}^{K_s} \delta_i(sk)\, X_{sk} \right\}, \qquad (A.1)$$

where

$$\sum_{k=1}^{K_s} \delta_i(sk) = 1$$

and

$\delta_i(sk) = 1$ if the $i$-case comes under the $k$th subcategory in the $s$th item,
$\delta_i(sk) = 0$ otherwise.

$\alpha_i$, which is called the total score of the $i$-case, has a numerical value, since $X_{sk}$
has a numerical value.

The correlation coefficient $\rho$ between $A$ and $\alpha_i$ is written as follows:

$$\rho(A, \alpha_i) = \frac{(1/n) \sum_{i=1}^{n} (A_i - \bar{A})(\alpha_i - \bar{\alpha})}{\sigma_A \, \sigma_\alpha}, \qquad (A.2)$$

where

$$\sigma_A^2 = \frac{1}{n} \sum_{i=1}^{n} (A_i - \bar{A})^2, \qquad \sigma_\alpha^2 = \frac{1}{n} \sum_{i=1}^{n} (\alpha_i - \bar{\alpha})^2.$$

In order to obtain a maximum value of $\rho$, or to estimate the outside variable from
the behavior pattern, put $\bar{A} = 0$ and $\bar{\alpha} = 0$, because $\rho$ is invariant under a shift of
origin. The score of each subcategory can be determined by solving

$$\frac{\partial \rho}{\partial X_{sk}} = 0 \quad (s = 1, 2, \ldots, R;\ k = 1, 2, \ldots, K_s). \qquad (A.3)$$
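
As an illustration of the procedure above, the following is a minimal Python sketch, not the original implementation. It relies on the standard result that, once $A$ is shifted to zero mean, scores maximizing $\rho$ can be obtained as a least-squares fit of $A$ on the indicators $\delta_i(sk)$. The function name `subcategory_scores`, the zero-based coding of subcategories, and the small data set are assumptions made for this example only.

```python
import numpy as np


def subcategory_scores(items, outside, n_sub):
    """Estimate subcategory scores X_sk by least squares (illustrative sketch).

    items   : (n, R) integer array; items[i, s] is the zero-based index of the
              subcategory checked by case i in item s (assumed coding).
    outside : (n,) array of the outside variable A.
    n_sub   : list with the number of subcategories K_s of each item.
    """
    items = np.asarray(items)
    outside = np.asarray(outside, dtype=float)
    n, n_items = items.shape

    # Build the indicator matrix delta_i(sk): one block of K_s columns per item.
    blocks = []
    for s in range(n_items):
        block = np.zeros((n, n_sub[s]))
        block[np.arange(n), items[:, s]] = 1.0
        blocks.append(block)
    delta = np.hstack(blocks)

    # Shift the origin so that the mean of A is zero.
    centered = outside - outside.mean()

    # The indicator matrix is rank deficient (each block sums to one),
    # so lstsq returns the minimum-norm least-squares solution.
    scores, *_ = np.linalg.lstsq(delta, centered, rcond=None)

    alpha = delta @ scores                    # total score of each case
    rho = np.corrcoef(alpha, outside)[0, 1]   # correlation with A
    return scores, alpha, rho


# Hypothetical data: two items with 3 and 2 subcategories, six cases.
items = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]])
outside = np.array([1.0, 2.0, 4.0, 11.0, 12.0, 14.0])
scores, alpha, rho = subcategory_scores(items, outside, n_sub=[3, 2])
print(np.round(alpha, 3), round(float(rho), 6))
```

Because $\rho$ does not change when all scores are multiplied by a positive constant, the least-squares solution is only one of many equivalent maximizers; it has the convenience that the total scores directly estimate the centered outside variable.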



Appendix II 

Design of Electroacoustic Systems 


II.1. The IACC of a Two-Channel-Loudspeaker-Reproduction System

The IACC of multiple-channel-loudspeaker-reproduction systems has been dis- 
cussed for the case of wide-band noise of 250 Hz to 2 kHz (Damaske and Ando, 
1972). Since a two-loudspeaker-reproduction system is the fundamental one that 
is often used, the optimal two loudspeaker directions for music signals and noise 
are discussed here, in which the IACC is minimized in a listening room (Ando, 
1978). 

Figure AII.1 demonstrates the calculated values of the IACC by Equation (3.25)
for the symmetric loudspeaker system as a function of the horizontal angles $|\xi|$ ($\eta = 0^\circ$),
for example, with Music Motif A. The geometric loudspeaker arrangement is
illustrated in the upper part of this figure. In this calculation, data of the measured 
interaural cross-correlation for a single directional sound were used (Ando, 1985). 
Optimal horizontal angles showing minima of the IACC (Ando, 1978) may be 
found at around: 


24°, 60°, 135°, and 150°. 

These optimal angles for several other music motifs and for wide-band noise are
listed in Table AII.1. Clearly, the common angles showing significant minima for
all of the sound sources may be observed near two angles

$|\xi| \approx 26^\circ$ and $151^\circ$.

In order to realize a smaller value of the IACC in an actual sound field, additional 
loudspeakers and/or reflectors in the listening room can be taken into consideration. 
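
The IACC values in this appendix are calculated from Equation (3.25) and measured correlation data, neither of which is reproduced here. As a rough illustration only, the sketch below estimates an IACC directly from a pair of ear signals as the maximum magnitude of the normalized interaural cross-correlation within an interaural delay of ±1 ms, which is the commonly used definition. The function name `iacc`, the sampling rate, and the synthetic test signals are assumptions for this example.

```python
import numpy as np


def iacc(left, right, fs, max_delay_ms=1.0):
    """Maximum magnitude of the normalized cross-correlation for |tau| <= max_delay_ms."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    max_lag = int(round(fs * max_delay_ms / 1000.0))

    # Normalize by the geometric mean of the two signal energies.
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))

    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            value = np.dot(left[:n - lag], right[lag:])
        else:
            value = np.dot(left[-lag:], right[:n + lag])
        best = max(best, abs(value) / norm)
    return best


# Synthetic signals (assumed): one noise source, a delayed copy, and an unrelated noise.
fs = 48000
rng = np.random.default_rng(0)
source = rng.standard_normal(4800)
delayed = np.roll(source, 10)            # about 0.21 ms interaural delay
unrelated = rng.standard_normal(4800)

print("coherent pair  :", iacc(source, delayed, fs))
print("incoherent pair:", iacc(source, unrelated, fs))
```

A loudspeaker arrangement that lowers the IACC corresponds, in this picture, to ear signals that behave more like the incoherent pair than the coherent one.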


II.2. A System of Controlling Temporal Factors 

In order to obtain the preferred condition of the temporal criteria for the existing 
sound field in a room, a real-time-control system, involving both the initial time- 


Figure AII.1. The value of the IACC for the two-channel-loudspeaker-reproduction system,
as a function of the horizontal angle $|\xi|$ ($\eta = 0^\circ$) to the listener. The correlation values
needed for calculation of the IACC are listed in Tables D.1 and D.2 of the reference (Ando,
1985).


Table AII.1. Horizontal angles to a listener indicating the IACC minima
for the standard two-channel loudspeaker reproduction system.

Sound source   Horizontal angle $|\xi|$ ($\eta = 0^\circ$) [degrees]

Music A        -     24       -     60       135       150       -
Music B        12    27       45    67       133       152       166
Music C        -     27       45    63       135       151       -
Music D        12    27       -     63       130       151       -
Noise*         -     26       -     63       126       151       -
Total          -     26 ± 2   -     63 ± 3   132 ± 4   151 ± 1   -

* Bandpass filtered noise, 0.25-2.0 kHz (Ando, 1985).

Figure AII.2. A real-time system for controlling temporal factors. 


delay gap and the subsequent reverberation time may be designed. Control is
achieved by calculating the values of $\tau_e$ of the source signal, so that the preferred
delay time and reverberation may be adjusted automatically, as shown in Figure
AII.2. In this operation, a fast ACF calculation is needed to produce the initial
time-delay gap between the direct sound and the first reflection and the subsequent
reverberation. The calculation of the short-time moving ACF may be performed
using only addition operations, for example, by converting the direct sound into
the three-value signals of 1, 0, and -1, such that

$$c[p(t)] = \begin{cases} 1, & p(t) > \Delta A, \\ 0, & |p(t)| \le \Delta A, \\ -1, & p(t) < -\Delta A, \end{cases} \qquad (A.4)$$

where $\Delta A$ is a threshold value to be set smaller than the maximum value of $p(t)$
and greater than the residual noise level of the acoustic system. The initial setting
is made from the program source (Figure 7.7).
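
The three-value conversion of Equation (A.4) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the real-time system of Figure AII.2: the function names `three_level` and `add_only_acf`, the test tone, and the threshold value are all hypothetical. Because every sample is 1, 0, or -1, each product in the ACF sum is itself 1, 0, or -1, so the sum reduces to counting sign agreements and disagreements.

```python
import numpy as np


def three_level(p, threshold):
    """Convert a signal into the values 1, 0, and -1 as in Equation (A.4)."""
    p = np.asarray(p, dtype=float)
    c = np.zeros(len(p), dtype=np.int8)
    c[p > threshold] = 1
    c[p < -threshold] = -1
    return c


def add_only_acf(c, max_lag):
    """Normalized ACF of a three-level signal, obtained by counting only."""
    n = len(c)
    acf = np.zeros(max_lag + 1)
    for lag in range(max_lag + 1):
        a, b = c[:n - lag], c[lag:]
        both = (a != 0) & (b != 0)
        agree = np.count_nonzero(both & (a == b))     # products equal to +1
        disagree = np.count_nonzero(both & (a != b))  # products equal to -1
        acf[lag] = agree - disagree
    if acf[0] != 0:
        acf = acf / acf[0]
    return acf


# Hypothetical test signal: a tone with a period of 20 samples.
samples = np.arange(400)
tone = np.sin(2.0 * np.pi * samples / 20.0)
coded = three_level(tone, threshold=0.3)
acf = add_only_acf(coded, max_lag=40)
print(acf[0], acf[10], acf[20])
```

The threshold plays the role of $\Delta A$: samples whose magnitude does not exceed it are coded as 0 and contribute nothing to the count, which is how residual noise is kept out of the ACF.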



Appendix III 

Time-Variant Sound Fields:
Variable-Delay Time of a Single 
Reflection 


The statistical results of the measured sound-pressure level of pure tones (Ueda 
and Ando, 1997) suggest that the fluctuation of the sound-pressure level is caused 
by a change in the time delay of reflections due to fluctuation in the path differ- 
ence. Strictly speaking, any sound field is more or less a time-variant system. This 
phenomenon is particularly significant in a large space. In order to understand the 
effects of such a fluctuation on subjective attributes, an initial investigation of the 
just noticeable difference (JND) in a variable delay time of a single reflection is 
discussed here (Ueda, Furuichi, and Ando, 1997). The final goal of this investiga- 
tion is to find more preferred conditions in a time-variant sound field than those in 
the time-invariant sound field. The experiments were conducted at the preferred 
time delays of a single reflection which are mentioned in Section 4.1, 120 ms and 
40 ms with Music Motifs A and B, respectively. 

The fluctuation interval of the delay time $\Delta$ was modulated by a sinusoidal
signal with frequencies of 0.2 Hz, 0.4 Hz, 0.8 Hz, and 1.6 Hz for Music Motif A
and 0.4 Hz, 0.8 Hz, 1.6 Hz, and 3.2 Hz for Music Motif B. The amplitude of the
single reflection is fixed to be the same as that of the direct sound.

Results of the JND of $\Delta$ as a function of its modulation frequency, $M_f$, are
shown in Figure AIII.1. The JND is increased by decreasing the modulation
frequency for both Music Motifs A and B. Also, the same is true for the standard
deviation of the JND. In other words, when the modulation frequency of the
system is low enough, then the fluctuation of the delay time $\Delta$ becomes inaudible.
This tendency is much more significant when the music has a fast tempo such as
Music Motif B ($\tau_e = 43$ ms). For slow-tempo music, such as Music Motif A ($\tau_e$
is 127 ms), the system is sensitive to rapid changes (high value of $M_f$), and the
JND of $\Delta$ becomes small. An approximate and tentative formula for the JND of
$\Delta$ was obtained within the range of the experimental condition, and is given by

$$\Delta(\mathrm{JND}) \approx \frac{k'}{\log \tau_e + k'' \log M_f}, \qquad (A.5)$$

where $k' = 7.0$ and $k'' = 2.5$. For example, for a given threshold value of
$\Delta$(JND) below which the difference is not noticed, $M_f$ may take a higher value as
the value of $\tau_e$ becomes small.
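
To make the trend of Equation (A.5) concrete, the following Python sketch evaluates the formula at the modulation frequencies used in the experiment. Several points are assumptions that the text does not state: the logarithms are taken to base 10, $\tau_e$ is expressed in milliseconds, $M_f$ in hertz, and the result is read as milliseconds. The function name `jnd_delay_fluctuation` is hypothetical, and the values are indicative only, since the formula is tentative and limited to the experimental range.

```python
import math

K1 = 7.0  # k' in Equation (A.5)
K2 = 2.5  # k'' in Equation (A.5)


def jnd_delay_fluctuation(tau_e_ms, mod_freq_hz):
    """Evaluate Equation (A.5), assuming base-10 logs, tau_e in ms, M_f in Hz."""
    denominator = math.log10(tau_e_ms) + K2 * math.log10(mod_freq_hz)
    if denominator <= 0.0:
        # The formula has no meaning here; the condition lies outside its range.
        raise ValueError("condition is outside the range of Equation (A.5)")
    return K1 / denominator


# Modulation frequencies and tau_e values quoted in the text.
conditions = {
    "Motif A": (127.0, [0.2, 0.4, 0.8, 1.6]),
    "Motif B": (43.0, [0.4, 0.8, 1.6, 3.2]),
}
for motif, (tau_e, freqs) in conditions.items():
    for mf in freqs:
        value = jnd_delay_fluctuation(tau_e, mf)
        print(f"{motif}: M_f = {mf} Hz -> JND about {value:.1f}")
```

Under these assumptions the computed JND falls as $M_f$ rises, in line with the description above, and the denominator approaches zero at low modulation frequencies, where the fluctuation becomes inaudible.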
Figure AIII.1.


It is assumed that the subjective preference of the time-variant sound field
is improved by adjusting the values of $[\Delta t_1]_p$ and $[T_{sub}]_p$, which are defined for
the time-invariant sound field. Initial preference judgments for sound fields with the
modulated delay time of the single reflection were conducted (Atagi, Ando, and
Ueda, 1998). The modulation frequency $M_f$ was set at 0.1 Hz, a condition of low
sensitivity, as shown in Figure AIII.1. The fluctuation intervals of the delay time
$\Delta$ were fixed at 24 ms (Motif A) and 30 ms (Motif B), respectively. Results show
that the preferred delay times of the single reflection $[\Delta t_1]_p$ are shortened to 119 ms
(Motif A) and 34 ms (Motif B). This may be caused by the fact that the maximum
delay time of the single reflection with fluctuation is $\Delta t_1 + \Delta/2$.
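
The relation between the fluctuation interval and the maximum delay can be checked with a short Python sketch. It assumes that the delay swings sinusoidally and symmetrically about $\Delta t_1$ with a peak-to-peak interval $\Delta$, which is consistent with the maximum of $\Delta t_1 + \Delta/2$ stated above; the function name `modulated_delay_ms` and the zero starting phase are assumptions.

```python
import numpy as np


def modulated_delay_ms(t_s, dt1_ms, delta_ms, mod_freq_hz):
    """Delay of the single reflection, swinging by delta_ms peak to peak about dt1_ms."""
    return dt1_ms + 0.5 * delta_ms * np.sin(2.0 * np.pi * mod_freq_hz * t_s)


# One full modulation period at M_f = 0.1 Hz.
t = np.linspace(0.0, 10.0, 1001)
for motif, dt1, delta in (("Motif A", 119.0, 24.0), ("Motif B", 34.0, 30.0)):
    delay = modulated_delay_ms(t, dt1, delta, 0.1)
    print(f"{motif}: delay ranges from {delay.min():.1f} ms to {delay.max():.1f} ms")
```

With the values reported above, the delay of Motif A swings between about 107 ms and 131 ms, and that of Motif B between about 19 ms and 49 ms, so the time-invariant preferred delays of 120 ms and 40 ms lie inside the range covered by the fluctuation.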
(The number in parentheses signifies the relevant equation.)

Total amplitude of reflections, (4.6).

Amplitude of the slow vertex response (SVR).

Pressure amplitude of the $n$th reflection determined by the
(1/r) law, $A_0$ being unity, (3.15).

Speed of sound in air [m/s], (3.16).

Constant to determine the preferred initial time-delay gap
between the direct sound and the first reflection, (6.2)
which is applied for any subjective responses.

Percentage of violation, (9.9).

Distance between the source and the observation point for
each reflection, $n = 1, 2, 3, \ldots$ See (3.15).

Frequency [Hz].

Sound pressures at the left and right ear entrances, (3.14).

Visual image at the left and right retina, (12.5).

Number of sound fields, (9.6), (9.9).

Function of any subjective responses in relation to physical
factors, (4.8), (12.1).

Scale values of subjective preference in relation to the left-
and right-hemispheric factors, (12.2)-(12.4).

Impulse responses of a room, (3.13).

Scale values of orthogonal factors, $i = 1, 2, 3, 4$, (4.9).

Impulse responses between the sound source and the left and
right ear-canal entrances in a free field, (3.13); Impulse
responses between the light source and each position of the
left and right retina, (12.5).

IACC Magnitude of the interaural cross-correlation function, (2.3),
(3.26). See also Figure 3.7.

IALD Interaural level difference, Section 3.4.2.

IATD Interaural time difference, Section 3.4.2.

Constant to determine preferred initial time-delay gap
between the direct sound and the first reflection, (6.2)
which is applied for any subjective responses.

$L_{eq}$ Equivalent sound level, (3.8).

$N_w$ Period of diffuser. See Figure 8.7.

$N_m$ Latencies at the $m$th maxima of the SVR, $m = 1, 2, 3$.

$p$ Significance level.

$p(t)$ Source signal, (3.2), (3.3).

$P$ Physiological magnitude corresponding to the IACC, (5.2).

$P(i > j)$ Probability that $i$ is preferred to $j$, (9.6).

$P(\omega)$ Fourier transform of $p(t)$, (3.2).

$P_d(\omega)$ Power density spectrum, (3.1).

$P_m$ Latencies at the $m$th minima of the SVR, $m = 1, 2, 3$.

$r$ Correlation coefficient.

$r_0$ Position of the sound source, $r_0 = (x_0, y_0, z_0)$.

$R_0$ Position of the light source, $R_0 = (x_0, y_0, z_0)$, (12.5).

$s$ = IACC, (6.5).

$s$ Integer.

$s(t)$ Impulse response of the A-weighting filter corresponding to
ear sensitivity; $f'_{l,r}(t) = f_{l,r}(t) * s(t)$ in (3.22).

$S$ Total scale value of preference, (4.9) or the total scale value
of a single subjective response, (12.1).

$S$ Total surface in a room, (3.20).

SCC Short-term cross-correlation coefficient. See Table 2.1.

$S_i$ Scale values of preference as a function of the listening
level, initial time-delay gap between the direct sound and
the first reflection, subsequent reverberation time, and IACC,
respectively, $i = 1, 2, 3, 4$, (4.9) and (4.10).

SD Scale of dimension of a concert hall, $\Delta t_1 = 22$ (SD), (4.2).

SI Speech intelligibility [%], (6.5).

$t$ = STI, (6.5).

$t$ Time [s].

$T$ Time interval [s].

$T$ Time factor of reflector, (12.5).

$T_{60}$ Reverberation time defined by Sabine, (2.2).

$T_{sub}$ Subsequent reverberation time defined by the decay rate to
decrease to 60 dB just after early reflections [s].

$[T_{sub}]_p$ Calculated preferred subsequent reverberation time [s], (4.7).

$w_n(t)$ Impulse response describing reflection properties of
boundaries, $n = 1, 2, 3, \ldots$, (3.13), (12.5).

$W_{IACC}$ Width of the IACC or width of the interaural cross-correlation
function at the $\tau_{IACC}$, as defined in Figure 3.7.

$Z_{ab}$ Scale value obtained by the probability, (9.2).

$\bar{\alpha}$ Averaged absorption coefficient, (3.20), (3.21).
Weights of the scale value of preference for each orthogonal
factor, $i = 1, 2, 3, 4$, (4.10).

Dirac delta function. 

= $(d_1 - d_0)/c$ [s]; Initial time-delay gap between the direct
sound and the first reflection. 

The most preferred delay time of the first reflection for alto- 
recorder soloists, (7.1). 

Delay time of the nth reflections relative to the direct 
sound [s], (3.17). 

Calculated preferred initial time-delay gap between the 
direct sound and the first reflection [s], (4.1), (4.3), (4.5); 
see also the equation between (4.6) and (4.7). 

= $2\pi(f_2 - f_1)$, (3.28), (6.3).

= $2\pi(f_2 + f_1)$, (3.28), (6.3).

Normalized interaural cross-correlation function, (3.23). 

Normalized ACF, (3.7). 

ACFs at the origin of time corresponding to averaged sound 
energies at the left ear and the right ear, respectively, 
(3.23), (3.24). 

Interaural cross-correlation function, (3.22). 

Autocorrelation function (ACF), (3.4). 

Elevation angle to a listener. 

Poorness of fit for the model, (9.7). 

Time delay [s]. 

Time delay [s]. 

Summation. 

Time delay [s]. 

Effective duration of the ACF, defined by the delay time at 
which the envelope of the normalized ACF becomes 0.1 
(the ten percentile delay) [s]. 

Interaural delay time at which the IACC is defined in 
Figure 3.7. 

Most preferred delay time of the first reflection, 

[ Af i = r e , (4.1). 

Angular frequency, $\omega = 2\pi f$ [rad/s].

Horizontal angle to a listener. 


Abbreviations 

ABR Auditory brainstem response. 

ACF Autocorrelation function. 

AEP Auditory evoked potential. 

ASW Apparent source width. 

BESTI Best ear STI, (6.6). 
CBW Continuous brain wave.

EDT Early decay time.

IACC Interaural cross-correlation.

IALD Interaural level difference.

IATD Interaural time difference.

JND Just noticeable difference.

LL Listening level measured in dB(A).

REM Rapid eye movement.

STI Speech transmission index.

SVR Slow vertex response.